> As the next step, I tried to upload a file (32472771 bytes) through
> the web frontend, which resulted in two issues:

Yeah, as Zooko pointed out, the Tahoe default is to upload everything as an immutable file, which can be as large as 12GB (and we're a few code changes away from raising that limit into the exabyte range).

Mutable files exist mainly to support directories, so we haven't yet finished the coding necessary to support large mutable files. The current limit of 3.5MB is somewhat arbitrary, but it accomplishes a couple of useful goals (reasonable alacrity, easy enough to implement quickly). Some day we'll have larger mutable files, but since 3.5MB is enough for a directory with tens of thousands of entries, it hasn't been a high priority so far.

> * When is `ps axflwww'ed the process' memory usage, I saw that the
> python instance that ran the connection where I uploaded the 31MB
> file grew beyond 300MB of VSZ.

That sounds like a bug in the code that's rejecting the too-large mutable file. Which version were you running? (1.2.0 or current trunk?). If it was current trunk, I'll look more closely at the problem. We've had runaway processes happen before when some bit of error-handling code got confused inside a loop. In fact, I think I remember a ticket about this, so I suspect it's been fixed in trunk.

> I better keep my brain away from
> thinking about uploading a gigabyte sized file on a 32bit
> system...

Oh, GB-sized *immutable* files work just fine. In fact we have over 800 files 1GB or larger on our production network right now, and 87 files in the 3GB-10GB range. It took their owners a long time to upload them, but as far as we can tell, the uploads succeeded.

> And during create-client and create-introducer, an empty directory is
> required. It would be nice to ignore "lost+found" while looking up
> directory contents...

Well, the idea is that 'tahoe create-client' creates a new directory for you. That way we can be sure that the directory will be empty.

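To make that concrete, here's a rough sketch of the sort of check involved. This is not the actual Tahoe code: the function name ensure_fresh_basedir is made up for illustration, and the real 'tahoe create-client' may behave differently in the details.

```python
import os

def ensure_fresh_basedir(path):
    """Hypothetical helper: return `path` only if it is a new or empty directory."""
    if os.path.exists(path):
        if not os.path.isdir(path):
            raise ValueError("%r exists and is not a directory" % (path,))
        # Anything already present (even "lost+found") makes us refuse,
        # since we can't know whether the user cares about it.
        leftovers = sorted(os.listdir(path))
        if leftovers:
            raise ValueError("%r is not empty (found: %s)"
                             % (path, ", ".join(leftovers)))
    else:
        # The usual case: the directory doesn't exist yet, so create it.
        os.makedirs(path)
    return path
```

With a check like this, a directory containing only "lost+found" gets refused just like any other non-empty directory.
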
The Tahoe process wants to own its base directory: sometimes it will delete things inside it, and various files inside that directory will control the Tahoe node's configuration. By using a brand-new directory, we don't have to worry about 1) accidentally deleting some existing file that the user cares about, and 2) how some pre-existing file might affect the node's behavior.

In the current trunk, we've moved most configuration settings into a single 'tahoe.cfg' INI-style file, but a number of discrete files are still used for backwards compatibility. For example, a file named 'sizelimit' basically controls how much space the storage server is allowed to use. We don't happen to use a config variable named "lost+found", but if we did, then allowing some unrelated file or directory by that name to be present in the tahoe basedir would change the behavior of the tahoe node in surprising ways.

So I'm inclined to continue to encourage users to have a dedicated directory for their Tahoe node (by having 'tahoe create-client' create a brand-new directory for them).

It sounds like you have a dedicated partition for your Tahoe node... that's great. Just use 'tahoe create-client /newpartition/tahoe', and let the node have its own dedicated directory as well.

cheers,
 -Brian
_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
