> -----Original Message-----
> From: Discussion of advanced .NET topics. [mailto:ADVANCED-
> [EMAIL PROTECTED] On Behalf Of Michael Sharpe
> Sent: Tuesday, February 05, 2008 6:04 PM
> To: ADVANCED-DOTNET@DISCUSS.DEVELOP.COM
> Subject: Re: [ADVANCED-DOTNET] Join/Merge multiple files together
> 
> There are several assumptions in your statement that are not accurate:
> 
> 1.  different clients are *already* writing to a single networked file
> system
> This is not true.  We have a final process called our Merge that takes
> the individual pieces and recombines them into a new file.  

It sounds like they *are* writing to the same file system (the file
server), just not to the same files. If they're writing to different
file systems, and the parts are then copied to a file server, then you
already have your hooking point: do the copy in code and stitch the
parts together at that step (a rough sketch of what I mean is below).
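
Something like this is all I mean by "stitch it together" - a minimal
sketch in Python, where the part paths and the output name are just
placeholders, since I don't know how your Merge names its pieces:

import shutil

def stitch_parts(part_paths, merged_path):
    """Concatenate the file parts, in order, into one merged file."""
    with open(merged_path, "wb") as merged:
        for part in part_paths:
            with open(part, "rb") as src:
                # copyfileobj streams in chunks, so big parts don't eat memory
                shutil.copyfileobj(src, merged)

# Hypothetical usage - the naming scheme is an assumption on my part:
# stitch_parts([r"\\fileserver\share\job1\Part1",
#               r"\\fileserver\share\job1\Part2"],
#              r"\\fileserver\share\job1\merged.dat")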

> File Part1 could be 5K and file Part2 could be 1MB.
> However, I can guarantee that Part2 belongs directly after Part1.  The
> actual data inside of them could differ based on numerous factors
> during analysis.

I was afraid of that wrinkle. If the part sizes aren't known up front,
then offsetting - pre-allocating the final file and having each client
write its piece at a fixed offset - probably won't work (unless your
reader app can magically deal with large blocks of null padding in its
files). Is it possible to quickly calculate the part sizes before the
analysis runs?
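
If it is, the offsetting approach would amount to each client seeking to
its pre-assigned position in one shared file. A sketch only, assuming
somebody hands each client its offset up front and the ranges never
overlap:

def write_part_at_offset(shared_path, offset, data):
    # The shared file must already exist (pre-created, and ideally
    # pre-sized to the total length); "r+b" opens it for writing
    # without truncating.
    with open(shared_path, "r+b") as f:
        f.seek(offset)
        f.write(data)

Treat that as an illustration of what "offsetting" would mean, not a
recommendation.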
 
> Lustre is one example that can do this.  Our internal version of
> OpenVMS running on both Alpha and Itanium can also do this.  We are
> looking for something that can be run using more common O/S like
> Windows and Linux.

Export a share from the OpenVMS box, write to it instead of your file
server, and shell-script the merge. I'd imagine that beats the hell out
of running Lustre, which basically strikes me as a research project.
Though they did say they're working on a Win32 port of the client...

But - since you brought up Linux as an option - don't forget to check
out FUSE http://fuse.sourceforge.net/ - a userland file system
framework. I don't know of an already-written driver that meets your
needs (check http://fuse.sourceforge.net/wiki/index.php/FileSystems),
but it's not difficult to write a barebones implementation. I'm
assuming again, since I don't have the necessary details, but I'd
imagine it'd go something like this:

1. Instead of writing to \\windows_fileserver\share, your distributed
clients would write to \\linux_samba_fileserver\share, exactly as they
do currently (multiple files).
2. When done, somebody writes
\\linux_samba_fileserver\share\bigfileindex.dat, which lists the file
part names and their order.
3. Instead of reading from \\windows_fileserver\share, your reading app
reads from \\linux_samba_fileserver\union_share, which is mounted with a
FUSE driver that unions the file parts together (a rough sketch of such
a driver follows this list).
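
I haven't written that driver, so take this as a rough sketch only. It
assumes the fusepy Python bindings, a bigfileindex.dat that simply lists
one part file name per line, and a single virtual file exposed as
/bigfile.dat - all of those names are my own placeholders:

#!/usr/bin/env python
# Rough sketch of a FUSE "union" file system using the fusepy bindings.
# Exposes one read-only virtual file (/bigfile.dat) whose contents are
# the parts listed, in order, in bigfileindex.dat, concatenated on the fly.
import errno
import os
import stat
import sys

from fuse import FUSE, FuseOSError, Operations

VIRTUAL_NAME = "bigfile.dat"  # assumed name for the stitched-together file

class UnionParts(Operations):
    def __init__(self, parts_dir):
        self.parts_dir = parts_dir
        # bigfileindex.dat is assumed to list one part name per line, in order
        index = os.path.join(parts_dir, "bigfileindex.dat")
        with open(index) as f:
            names = [line.strip() for line in f if line.strip()]
        self.parts = [os.path.join(parts_dir, n) for n in names]
        self.sizes = [os.path.getsize(p) for p in self.parts]
        self.total = sum(self.sizes)

    def getattr(self, path, fh=None):
        if path == "/":
            return dict(st_mode=(stat.S_IFDIR | 0o755), st_nlink=2)
        if path == "/" + VIRTUAL_NAME:
            return dict(st_mode=(stat.S_IFREG | 0o444), st_nlink=1,
                        st_size=self.total)
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", "..", VIRTUAL_NAME]

    def read(self, path, size, offset, fh):
        if path != "/" + VIRTUAL_NAME:
            raise FuseOSError(errno.ENOENT)
        out = bytearray()
        pos = 0  # running offset of the current part within the virtual file
        for part, part_size in zip(self.parts, self.sizes):
            if size > 0 and offset < pos + part_size:
                start = max(0, offset - pos)
                want = min(size, part_size - start)
                with open(part, "rb") as f:
                    f.seek(start)
                    out += f.read(want)
                offset += want
                size -= want
            pos += part_size
        return bytes(out)

if __name__ == "__main__":
    # e.g. python union_fs.py /exports/share /exports/union_share
    FUSE(UnionParts(sys.argv[1]), sys.argv[2], foreground=True, ro=True)

Mount that over whatever directory Samba exports as union_share, and the
reading app just sees one big file; nothing on the Windows side has to
change except the UNC path it reads from.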

As a last resort, if you are going to go the route of mucking with
allocation tables, I'd strongly suggest staying the hell away from
NTFS. That's not a file system you want to be screwing around with. I'm
quite sure that ext2/3 and FAT32 are far better documented, and a lot
safer to manipulate directly, than NTFS.

--Mark Brackett

===================================
This list is hosted by DevelopMentor®  http://www.develop.com

View archives and manage your subscription(s) at http://discuss.develop.com
