Re: superlifter design notes (OpenVMS perspective)
On Tue, Jul 30, 2002 at 12:00:21AM -0400, John E. Malmberg wrote:

To help explain why the backup and file distribution have such different implementation issues, let me give some background.

This is a dump of an OpenVMS native text file. This is the format that virtually all text editors produce on it.

Dump of file PROJECT_ROOT:[rsync_vms]CHECKSUM.C_VMS;1 on 29-JUL-2002 22:02:21.32
File ID (118449,3,0) End of file block 8 / Allocated 8

Virtual block number 1 (0001), 512 (0200) bytes

  67697279 706F4320 20200025 2A2F0002 ../*%. Copyrig 00
  72542077 6572646E 41202943 28207468 ht (C) Andrew Tr 10
  20200024 00363939 31206C6C 65676469 idgell 1996.$. 20
  50202943 28207468 67697279 706F4320 Copyright (C) P 30
  39312073 61727265 6B63614D 206C7561 aul Mackerras 19 40
  72702073 69685420 20200047 3639 96..G. This pr 50

Each record is preceded by a 16 bit count of how long the record is. While any value can be present in a record, usually only printable ASCII is present.

The file must be open in binary mode.

On an fopen() call, the b mode qualifier causes the file to be opened in binary mode, so no translation is done. This has no effect on UNIX, but it is important on other file platforms.

This flag is documented as part of the ISO C standard, but has no effect on a UNIX platform.

While VMS and a few other OSs make the distinction between text and binary files, VMS is fairly unique. UNIX is our primary focus and I don't intend to get bogged down with OS specifics on all platforms. POSIX has no mechanism for determining the content of files. All files are binary.

To meet your record-oriented text file needs I would say that the VMS port would need to have options and extra logic. For backups all files could be opened with the b mode qualifier.
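To make the record layout in the dump concrete, here is a minimal Python sketch that splits a byte buffer into length-prefixed records. It assumes the 16 bit count is little-endian and that each record is padded to an even byte boundary; both assumptions are read off the dump above (the 37-byte record is followed by a single 00 byte), not taken from any VMS documentation. The function name read_vms_records is hypothetical, and the sample bytes are the first 44 bytes of the dump written out in file order.

```python
import struct

def read_vms_records(data: bytes) -> list:
    """Split a buffer of count-prefixed records into a list of byte strings.

    Assumptions (inferred from the dump, not from a specification):
    - each record starts with a 16-bit little-endian byte count
    - records with an odd count are followed by one pad byte
    """
    records = []
    pos = 0
    while pos + 2 <= len(data):
        (count,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if pos + count > len(data):
            raise ValueError("record runs past end of buffer")
        records.append(data[pos:pos + count])
        pos += count + (count & 1)  # skip the pad byte after odd-length records
    return records

# First 44 bytes of the dump, in file order (the dump prints each
# longword with its bytes reversed and the columns right to left).
SAMPLE = bytes.fromhex(
    "02002F2A" "25002020" "20436F70" "79726967" "68742028" "43292041"
    "6E647265" "77205472" "69646765" "6C6C2031" "39393600"
)

if __name__ == "__main__":
    for record in read_vms_records(SAMPLE):
        print(record.decode("ascii"))
```

A sender that wanted to present such a file as a UNIX-style text stream could join the records with line feeds, which is the kind of conversion the reply above treats as port-specific logic.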
For sending to non-VMS systems text files would want conversion to another format, and for receiving some heuristics would identify text files for conversion (updating text files could take advantage of the local file's attributes). Such file conversions would require in-core translation for checksums, file length and change merges. This puts them into the same category as unix2dos text-file conversions and backup compression. Such file conversions are outside the scope of current consideration but where possible we should keep them in mind for future enhancement.

Then there are the file attributes:

  CHECKSUM.C_VMS;1       File ID: (118449,3,0)
  Size: 8/8              Owner: [SYSOP,MALMBERG]
  Created: 29-JUL-2002 22:01:37.95
  Revised: 29-JUL-2002 22:01:38.01 (1)
  Expires: None specified
  Backup: No backup recorded
  Effective: None specified
  Recording: None specified
  File organization: Sequential
  Shelved state: Online
  Caching attribute: Writethrough
  File attributes: Allocation: 8, Extend: 0, Global buffer count: 0
                   No version limit
  Record format: Variable length, maximum 0 bytes, longest 71 bytes
  Record attributes: Carriage return carriage control
  RMS attributes: None
  Journaling enabled: None
  File protection: System:RWED, Owner:RWED, Group:RWED, World:RE
  Access Cntrl List: None
  Client attributes: None

And this is for a simple file format. Files can be indexed or have multiple keys. And there is no cross platform API for retrieving all of these attributes, so how do you determine how to transmit them through?

We can't rely on a pre-existing cross-platform API. What I'm inclined toward is to use native I/O routines. The protocol would be focused on UNIX file semantics. We might add a few reasonable additional bits for those platforms that will be VERY common interoperators. These other attributes I would treat as special extended attributes.
Security is another issue: In some cases the binary values for the access control entries need to be preserved, and in other cases, the text values need to be preserved. It also may need a translation from one set of text or binary values to another set. And again, there are no cross platform APIs for returning this information.

See above. We need to support binary IDs and text IDs and ID squashing. I'm not sure yet but mode bits will probably be binary. There is no reason to transmit them as text.

So a backup type application is going to have to have a lot of platform specific tweaks, and some way to pass all this varied information between the client and server. As each platform is added, an extension may need to be developed.

Platform specific tweaks will only be built into the binaries for that platform. The protocol will have certain UNIX centricities but the flexibility to transmit platform specifics.

A server definitely needs to know if it is in backup mode as opposed to file distribution mode. In file distribution mode, only a few file attributes need to be preserved, and a loss of
Re: superlifter design notes (OpenVMS perspective)
JS == jw schultz [EMAIL PROTECTED] wrote the following on Sat, 27 Jul 2002 23:05:50 -0700

JS> As a poor example let us suppose that a filename contained a
JS> /. A UNIX system using translation might turn this into _.
JS> Escapement might turn it into =2F and = into =3D.

rdiff-backup has this feature. I'm not sure anyone uses it, and it was a pain to add and to test adequately, especially when the additional quoting characters push the length of the filename over the limit. If I had to do it over I probably would have skipped this feature (or at least waited until lots of people bothered me about it).

-- Ben Escoto
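The escapement scheme quoted above (/ becomes =2F, = becomes =3D) can be sketched in a few lines of Python. This is only an illustration of the idea, not rdiff-backup's actual implementation; the function names and the default set of illegal characters are assumptions, and the sketch only handles characters whose code point fits in two hex digits.

```python
def escape_name(name: str, illegal: str = "/") -> str:
    """Replace each illegal character, and '=' itself, with =XX (hex)."""
    out = []
    for ch in name:
        if ch == "=" or ch in illegal:
            if ord(ch) > 0xFF:
                raise ValueError("sketch only escapes single-byte code points")
            out.append("=%02X" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)

def unescape_name(escaped: str) -> str:
    """Invert escape_name: turn every =XX back into the original character."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] == "=":
            out.append(chr(int(escaped[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(escaped[i])
            i += 1
    return "".join(out)

if __name__ == "__main__":
    original = "a/b=c"
    escaped = escape_name(original)
    print(escaped, len(original), len(escaped))
```

Each escaped character grows from one character to three, which is the filename-length problem described in the message above.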
Re: superlifter design notes (OpenVMS perspective)
On 27 Jul 2002, jw schultz [EMAIL PROTECTED] wrote:

The server has no need to deal with client limitations. I am saying that the protocol would make the bare minimum of limitations (null termination, no nulls in names).

It probably also makes sense to follow NFS4 in representing paths as a vector of components, rather than as a single string with '/'s in it or whatever. ['home', 'mbp', 'work', 'rsync'] avoids any worries about / vs \ vs :, and just lets the client do whatever makes sense.

I don't know a lot about i18n support, but it does seem that programs will need to know what encoding to use for the filesystem on platforms that are not natively Unicode. On Unix it probably makes sense to default to UTF-8, but latin-1 or others are equally likely. This is independent of the choice of message locale. I think the W32 APIs are defined in Unicode so we don't need to worry.

Quoting, translating, or rejecting illegal characters could all make sense depending on context.

I guess I see John's backup vs distribution question as hopefully being different profiles or wrappers around a single codebase, rather than different programs. Perhaps the distinction he's getting at is whether the audience for the client who uploaded the data is the same client, or somebody else?

-- Martin

-- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
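As a rough illustration of the component-vector idea, the Python sketch below keeps a path as a list of names and only joins it with a separator at the point where a particular client needs a local path string. The function name render_path and the choice to raise an error on a clashing component are assumptions made for the example; as the message notes, a real client might quote or translate instead of rejecting.

```python
def render_path(components, separator="/"):
    """Join path components using the local platform's separator.

    A component that contains the separator or a NUL byte cannot be
    represented as-is on this platform, so the sketch rejects it.
    """
    for part in components:
        if not part or separator in part or "\0" in part:
            raise ValueError("component not representable here: %r" % (part,))
    return separator.join(components)

if __name__ == "__main__":
    path = ["home", "mbp", "work", "rsync"]
    print(render_path(path))            # Unix-style client
    print(render_path(path, "\\"))      # backslash-separated client
```

Because the separator never travels on the wire, a name such as "a/b" is legal in the vector and only becomes a problem for a client whose filesystem reserves that character.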
Re: superlifter design notes (OpenVMS perspective)
On Sun, Jul 28, 2002 at 05:39:22PM +1000, Martin Pool wrote:

On 27 Jul 2002, jw schultz [EMAIL PROTECTED] wrote:

The server has no need to deal with client limitations. I am saying that the protocol would make the bare minimum of limitations (null termination, no nulls in names).

It probably also makes sense to follow NFS4 in representing paths as a vector of components, rather than as a single string with '/'s in it or whatever. ['home', 'mbp', 'work', 'rsync'] avoids any worries about / vs \ vs :, and just lets the client do whatever makes sense.

That is _one_ of the reasons that I said filenames should be CWD relative (no path components). That way the protocol never needs to know about / vs \ with the possible exception of links. The vector component list would address the issue of link destinations nicely with a null terminated list of null terminated strings 'home\0mbp\0work\0rsync\0\0'.

I don't know a lot about i18n support, but it does seem that programs will need to know what encoding to use for the filesystem on platforms that are not natively Unicode. On Unix it probably makes sense to default to UTF-8, but latin-1 or others are equally likely. This is independent of the choice of message locale. I think the W32 APIs are defined in Unicode so we don't need to worry.

Quoting, translating, or rejecting illegal characters could all make sense depending on context.

I avoided the idea of rejection but there may be cases where we need it. Rejection would mean the file would not be transferred. For interactive use the default would be to translate, which would seldom be needed because most transfers would be of filenames supported on both ends. Any time a translation occurs a warning would be generated unless silenced.

I guess I see John's backup vs distribution question as hopefully being different profiles or wrappers around a single codebase, rather than different programs.
Perhaps the distinction he's getting at is whether the audience for the client who uploaded the data is the same client, or somebody else?

The backup vs. distribution question seems to hang on what we do when the storage semantics of the two nodes have a mismatch. For backups we want to retain all data either through lossless conversion or in some kind of meta-data store. I'm inclined to take advantage of extended attributes (NAME=rsync_perms etc. ?) But for distribution we can afford some meta-data loss as long as future runs will compare correctly (ignore the loss).

I agree with you, this is either a different wrapper or perhaps a mode setting multiple options. The biggest difference seems to be on the server so perhaps the same codebase might generate a server that has additional capabilities but the client for both would be the same regardless.

--
J.W. Schultz            Pegasystems Technologies
email address: [EMAIL PROTECTED]
Remember Cernan and Schmitt
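The null terminated list of null terminated strings suggested earlier in this message for link destinations could look roughly like the Python sketch below. The function names are hypothetical, UTF-8 is assumed for the component encoding, and empty components are rejected because an empty string would be indistinguishable from the list terminator.

```python
def encode_components(components):
    """Encode path components as NUL-terminated strings plus a final NUL."""
    out = bytearray()
    for part in components:
        raw = part.encode("utf-8")
        if not raw or b"\0" in raw:
            raise ValueError("empty or NUL-containing component: %r" % (part,))
        out += raw + b"\0"
    out += b"\0"  # an empty string marks the end of the list
    return bytes(out)

def decode_components(wire):
    """Decode the format produced by encode_components."""
    parts = []
    pos = 0
    while True:
        end = wire.index(b"\0", pos)  # raises ValueError if unterminated
        if end == pos:                # empty string: end of list
            return parts
        parts.append(wire[pos:end].decode("utf-8"))
        pos = end + 1

if __name__ == "__main__":
    wire = encode_components(["home", "mbp", "work", "rsync"])
    print(wire)
    print(decode_components(wire))
```

The only bytes the wire format reserves are NULs, which matches the "no nulls in names" limitation proposed for the protocol.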
Re: superlifter design notes (OpenVMS perspective)
Lenny Foner [EMAIL PROTECTED] wrote:

jw schultz wrote:

I find the use of funny chars (including space) in filenames offensive but we need to deal with internationalizations and sheer stupidity.

Regardless of what you think about them, MacOS comes with pathnames containing spaces right out of the box (think System Folder). Yes, rsync needs to not make assumptions about what's legal in a filename. Some OS's think slashes are path separators; some put them inside individual filenames. Some think [] are separators. We shouldn't try to make any assumptions.

Agreed.

For a file distribution program, for each file to be transferred, ideally the server will have a list of how the file should be represented on platforms that the server knows about. The client would be able to tell the server about new platforms, but the server would not be required to remember the information if it did not trust the client.

As I work through my backlog of e-mail messages, I will give some possible implementation details as answers to other posts.

-John
[EMAIL PROTECTED]
Personal Opinion Only
Re: superlifter design notes (OpenVMS perspective)
Martin Pool wrote:

On 22 Jul 2002, John E. Malmberg [EMAIL PROTECTED] wrote:

A clean design allows optimization to be done by the compiler, and tight optimization should be driven by profiling tools.

Right. So, for example, glib has a very smart assembly ntohl() and LZO is tight code. I would much rather use them than try to reduce the byte count by a complicated protocol.

Many compilers will inline ntohl() giving the call very low overhead.

5. Similarly, no silly tricks with forking, threads, or nonblocking IO: one process, one IO.

Forking or multiple processes can be high cost on some platforms. I am not experienced enough with Posix threads to judge their portability. But as long as it is done right, non-blocking I/O is not a problem for me. If you structure the protocol processing where no subroutine ever posts a write and then waits for a read, you can set up a library that can be used either blocking or non-blocking.

Yes, that's how librsync is structured. Is it reasonable to assume that some kind of poll/select arrangement is available everywhere? In other words, can I check to see if input is available from a socket without needing to block trying to read from it?

I can poll, but I prefer to cause the I/O completion to trigger a completion routine. But that is not portable. :-)

I would hope that only a relatively small layer needs to know about how and when IO is scheduled. It will make callbacks (or whatever) to processes that produce and consume data. That layer can be adapted, or if necessary, rewritten, to use whatever async IO features are available on the relevant platform.

Test programs that internally fork() are very troublesome for me. Starting a few hundred individually by a script is not.

If we always use fork/exec (aka spawn()) is that OK? Is it only processes that fork and that then continue executing the same program that cause trouble?

Mainly. I can deal with spawn() much more easily than fork().

I can only read UNIX shell scripts of minor complexity.
Apparently Python runs on VMS. I'm in favour of using it for the test suite; it's much more effective than sh.

Unfortunately the Python maintainer for VMS retired, and I have not been able to figure out how to get his source to compile. I have got the official Python to compile and link with only having to fix one severe programming error. However it still is not running. I am isolating where the problem is in my free time.

12. Try to keep the TCP pipe full in both directions at all times. Pursuing this intently has worked well in rsync, but has also led to a complicated design prone to deadlocks.

Deadlocks can be avoided.

Do you mean that in the technical sense of deadlock avoidance? i.e. checking for a cycle of dependencies and failing? That sounds undesirably complex.

No, by not using a complex protocol, so that there are no deadlocks.

9 Model files as composed of a stream of bytes, plus an optional table of key-value attributes. Some of these can be distinguished to model ownership, ACLs, resource forks, etc.

Not portable. This will effectively either exclude all non-UNIX or make it very difficult to port to them.

Non-UNIX is not completely fair; as far as I know MacOS, Amiga, OS/2, Windows, BeOS, and QNX are {byte stream + attributes + forks} too. I realize there are platforms which are record-oriented, but I don't have much experience on them. How would the rsync algorithm even operate on such things?

Record files need to be transmitted on record boundaries, not arbitrary boundaries. Also random access can not be used. The file segments need to be transmitted in order. For a UNIX text file, a record is a line of text delimited by the line-feed character.

[This turned out to be a big problem in porting SAMBA. An NT client transfers a large file by sending 64K, skipping 32K, sending some more and then sending the 32K later. Samba does not do this, so the resulting corruption of a record structured file did not show up in the initial testing.
I still have not found the ideal fix for SAMBA, but implemented a workaround.]

Is it sufficient to model them as ascii+linefeeds internally, and then do any necessary translation away from that model on IO?

Yes, as long as no partial records are transmitted. Partial records can be a problem. If I know the rest of the record is coming, then I can wait for it, but if the rest of the record is going to be skipped, then it takes more work.

BINARY files are no real problem. The binary is either meaningful on the client or server or it is not. However file attributes may need to be maintained. If the file attributes are maintained, it would be possible for me to have an OpenVMS indexed file moved up to a UNIX server, and then back to another OpenVMS system and be usable.

Possibly it would be nice to have a way to stash attributes that cannot be represented on the
Re: superlifter design notes (OpenVMS perspective)
Qualities

1. Be reasonably portable: at least in principle, it should be possible to port to Windows, OS X, and various Unixes without major changes.

In general, I would like to see OpenVMS in that list.

Principles

1. Clean design rather than micro-optimization.

A clean design allows optimization to be done by the compiler, and tight optimization should be driven by profiling tools.

4. Keep the socket open until the client gets bored. (Avoids startup time; good for on-line mirroring; good for interactive clients.)

I am afraid I do not quite understand this one. Are you referring to a server waiting for a reconnect for a while instead of reconnecting? If so, that seems to be a standard behavior for network daemons.

5. Similarly, no silly tricks with forking, threads, or nonblocking IO: one process, one IO.

Forking or multiple processes can be high cost on some platforms. I am not experienced enough with Posix threads to judge their portability. But as long as it is done right, non-blocking I/O is not a problem for me. If you structure the protocol processing where no subroutine ever posts a write and then waits for a read, you can set up a library that can be used either blocking or non-blocking.

The same for file access. On OpenVMS, I can do all I/O in a non-blocking manner. The problem is that I must use native I/O calls to do so. If the structure is that after any I/O, control returns to a common point for the next step in the protocol, then it is easy to move from a blocking implementation to a non-blocking one. MACROs can probably be used to allow common code to be used for blocking or non-blocking implementations.

Two systems that use non-blocking mode can push a higher data rate through the same time period.

This is an area where I can offer help to produce a clean implementation. One of the obstacles to me cleanly implementing RSYNC as a single process is when a subroutine is waiting for a response to a command that it sent.
If that subroutine is called as an asynchronous event, it blocks all other execution in that process. That same practice hurts in SAMBA.

8. Design for testability. For example: don't rely on global resources that may not be available when testing; do make behaviours deterministic to ease testing.

Test programs that internally fork() are very troublesome for me. Starting a few hundred individually by a script is not. I can only read UNIX shell scripts of minor complexity.

10. Have a design that is as simple as possible.

11. Smart clients, dumb servers. This is claimed to be a good design pattern for internet software. rsync at the moment does not really adhere to it. Part of the point of rsync is that having a smarter server can make things much more efficient. A strength of this approach is that to add features, you (often) only need to add them to the client.

It should be a case of who can do the job easier.

12. Try to keep the TCP pipe full in both directions at all times. Pursuing this intently has worked well in rsync, but has also led to a complicated design prone to deadlocks.

Deadlocks can be avoided. Make sure if an I/O is initiated, that the next step is to return to the protocol dispatching routine.

General design ideas

9 Model files as composed of a stream of bytes, plus an optional table of key-value attributes. Some of these can be distinguished to model ownership, ACLs, resource forks, etc.

Not portable. This will effectively either exclude all non-UNIX or make it very difficult to port to them.

Binary files are a stream of bytes. Text files are a stream of records. Many systems do not store text files as a stream of bytes. They may or may not even be ASCII.

If you are going to maintain meta files for ACLs and Resource Forks, then there should be some provision to supply attributes for an entire directory or individual files.

BINARY files are no real problem. The binary is either meaningful on the client or server or it is not.
However file attributes may need to be maintained. If the file attributes are maintained, it would be possible for me to have an OpenVMS indexed file moved up to a UNIX server, and then back to another OpenVMS system and be usable.

Currently in order to do so, I must encapsulate them in a .ZIP archive. That is .ZIP, not GZIP or BZIP. On OpenVMS those are only useful to transfer source and a limited subset of binaries.

TEXT files are much different than binary files, except on UNIX. A text file needs to be processed by records, and on many systems can not have the records updated randomly, or if they can, it is not very efficient.

If a target use for this program is to be for assisting in cross platform open source synchronization, then it really needs to properly address the text files.

A server should know how to represent a TEXT file in a portable format to the client. Stream records in ASCII, delimited
Re: superlifter design notes (OpenVMS perspective)
On 22 Jul 2002, John E. Malmberg [EMAIL PROTECTED] wrote:

Qualities

1. Be reasonably portable: at least in principle, it should be possible to port to Windows, OS X, and various Unixes without major changes.

In general, I would like to see OpenVMS in that list.

Yes, OpenVMS, perhaps also QNX and some other TCP/IP-capable RTOSs. Having a portable protocol is a bit more important than a portable implementation. I would hope that with a new system, even if the implementation was unix-bound, you would at least be able to write a new client, reusing some of the code, that worked well on ITS.

A clean design allows optimization to be done by the compiler, and tight optimization should be driven by profiling tools.

Right. So, for example, glib has a very smart assembly ntohl() and LZO is tight code. I would much rather use them than try to reduce the byte count by a complicated protocol.

4. Keep the socket open until the client gets bored. (Avoids startup time; good for on-line mirroring; good for interactive clients.)

I am afraid I do not quite understand this one. Are you referring to a server waiting for a reconnect for a while instead of reconnecting?

What I meant is that I would like to be able to open a connection to a server, download a file, leave the connection open, decide I need another file, and then get that one too. You can do this with FTP, and (kind of) HTTP, but not rsync, which needs to know the command up front. Of course the server can drop you too by a timeout or whatever.

If so, that seems to be a standard behavior for network daemons.

5. Similarly, no silly tricks with forking, threads, or nonblocking IO: one process, one IO.

Forking or multiple processes can be high cost on some platforms. I am not experienced enough with Posix threads to judge their portability. But as long as it is done right, non-blocking I/O is not a problem for me.
If you structure the protocol processing where no subroutine ever posts a write and then waits for a read, you can set up a library that can be used either blocking or non-blocking.

Yes, that's how librsync is structured. Is it reasonable to assume that some kind of poll/select arrangement is available everywhere? In other words, can I check to see if input is available from a socket without needing to block trying to read from it?

I would hope that only a relatively small layer needs to know about how and when IO is scheduled. It will make callbacks (or whatever) to processes that produce and consume data. That layer can be adapted, or if necessary, rewritten, to use whatever async IO features are available on the relevant platform.

Test programs that internally fork() are very troublesome for me. Starting a few hundred individually by a script is not.

If we always use fork/exec (aka spawn()) is that OK? Is it only processes that fork and that then continue executing the same program that cause trouble?

I can only read UNIX shell scripts of minor complexity.

Apparently Python runs on VMS. I'm in favour of using it for the test suite; it's much more effective than sh.

12. Try to keep the TCP pipe full in both directions at all times. Pursuing this intently has worked well in rsync, but has also led to a complicated design prone to deadlocks.

Deadlocks can be avoided.

Do you mean that in the technical sense of deadlock avoidance? i.e. checking for a cycle of dependencies and failing? That sounds undesirably complex.

Make sure if an I/O is initiated, that the next step is to return to the protocol dispatching routine.

9 Model files as composed of a stream of bytes, plus an optional table of key-value attributes. Some of these can be distinguished to model ownership, ACLs, resource forks, etc.

Not portable. This will effectively either exclude all non-UNIX or make it very difficult to port to them.
Non-UNIX is not completely fair; as far as I know MacOS, Amiga, OS/2, Windows, BeOS, and QNX are {byte stream + attributes + forks} too. I realize there are platforms which are record-oriented, but I don't have much experience on them. How would the rsync algorithm even operate on such things? Is it sufficient to model them as ascii+linefeeds internally, and then do any necessary translation away from that model on IO?

BINARY files are no real problem. The binary is either meaningful on the client or server or it is not. However file attributes may need to be maintained. If the file attributes are maintained, it would be possible for me to have an OpenVMS indexed file moved up to a UNIX server, and then back to another OpenVMS system and be usable.

Possibly it would be nice to have a way to stash attributes that cannot be represented on the destination filesystem, but perhaps that is out of scope.

I recall seeing a comment somewhere in this thread about timestamps being left to 16 bits.

No, 32 bits. 16 bits is obviously silly.

File timestamps
Re: superlifter design notes (OpenVMS perspective)
User-Agent: Mozilla/5.0 (X11; U; OpenVMS COMPAQ_AlphaServer_DS10_466_MHz; en-US; rv:1.1a) Gecko/20020614

If something as complex as Mozilla can run on OpenVMS then I guess we really have no excuse :-)

-- Martin
Re: superlifter design notes (OpenVMS perspective)
On Mon, Jul 22, 2002 at 03:34:37PM +1000, Martin Pool wrote:

On 22 Jul 2002, John E. Malmberg [EMAIL PROTECTED] wrote:

If you structure the protocol processing where no subroutine ever posts a write and then waits for a read, you can set up a library that can be used either blocking or non-blocking.

Yes, that's how librsync is structured. Is it reasonable to assume that some kind of poll/select arrangement is available everywhere? In other words, can I check to see if input is available from a socket without needing to block trying to read from it?

I think we can assume that any platform supporting POSIX I/O semantics will be sufficient.

I would hope that only a relatively small layer needs to know about how and when IO is scheduled. It will make callbacks (or whatever) to processes that produce and consume data. That layer can be adapted, or if necessary, rewritten, to use whatever async IO features are available on the relevant platform.

That is the better approach. Use I/O routines so most processing can be

    while (get_input()) { process(); send_output(); }

Then the I/O routines can be defined according to platform.

[snip]

9 Model files as composed of a stream of bytes, plus an optional table of key-value attributes. Some of these can be distinguished to model ownership, ACLs, resource forks, etc.

Not portable. This will effectively either exclude all non-UNIX or make it very difficult to port to them.

Non-UNIX is not completely fair; as far as I know MacOS, Amiga, OS/2, Windows, BeOS, and QNX are {byte stream + attributes + forks} too. I realize there are platforms which are record-oriented, but I don't have much experience on them. How would the rsync algorithm even operate on such things? Is it sufficient to model them as ascii+linefeeds internally, and then do any necessary translation away from that model on IO?

BINARY files are no real problem. The binary is either meaningful on the client or server or it is not.
However file attributes may need to be maintained. If the file attributes are maintained, it would be possible for me to have an OpenVMS indexed file moved up to a UNIX server, and then back to another OpenVMS system and be usable.

If a platform has some special type of file it would be responsible for converting to/from a multi-segment bytestream. By multi-segment bytestream I mean a sequence of binary_data blocks having an offset and length. In this way we have the potential to deal with sparse files and to packetize the transfers of large files. Obviously offset and size are 64bit.

Possibly it would be nice to have a way to stash attributes that cannot be represented on the destination filesystem, but perhaps that is out of scope.

In general what we have to expect is that we can only transfer the lowest common denominator of file attributes. It would be possible to build a server that didn't depend on local filesystem semantics and so could support an attribute superset. But that is out of scope for now.

File timestamps for OpenVMS and for Windows NT are in 64 bits, but use different base dates.

I think we should use something like 64-bit microseconds-since-1970, with a precision indicator.

File attributes need to be stored somewhere, so a reserved directory or filename convention will need to be used. I assume that there will be provisions for a server to be marked as a master reference.

What do you mean master reference?

See my super/subset comment above.

For flexibility, a client may need to provide filename translation, so the original filename (that will be used on the wire) should be stored as a file attribute. It also follows that it probably is a good idea to store the translated filename as an attribute also.

Can you give us an example? Are you talking about things like managing case-insensitive systems?

Filenames should be null terminated UTF-8. If a given platform cannot support that, the port to that platform will be responsible for conversion.
We probably should designate an inline subroutine for filename conversion. The only alternative would be to restrict filenames to ascii [-_.A-Za-z0-9] or something similarly restrictive. I find the use of funny chars (including space) in filenames offensive but we need to deal with internationalizations and sheer stupidity.

--
J.W. Schultz            Pegasystems Technologies
email address: [EMAIL PROTECTED]
Remember Cernan and Schmitt