Jan Harkes wrote:
On Wed, Jun 27, 2007 at 10:36:35AM -0700, Yan Seiner wrote:
I'm trying to build a pair of Coda "appliances" - basically embedded
boxes with a VPN and a Coda server/client, each acting as a Samba server
for its network. The goal is to have two identical replicas of the same
data.
One side would be the server, the other would be the client. Otherwise
the boxes would be identical.
I've got Coda built and installed, and now I'm trying to map out my
approach.
The hardware consists of a 200 MHz ARM CPU with 32 MB of RAM. The data
consists of approximately 300 GB of CAD files.
Is this enough RAM? Can the RVM metadata be kept in a swap partition,
or do I need physical RAM for it?
Sounds like your hardware is in the same ballpark as the Linksys NSLU2 I
have at home. I guess it 'could' run a server, but I really haven't
tried.
Pretty close. I can get the hardware with up to 128 MB RAM if needed.
The metadata is VM-backed, so having swap space is definitely useful.
The server doesn't really care about physical RAM, except that swapping
will slow it down, which in turn would cause the client to switch to
disconnected or weakly connected operation even over a well-connected
network.
We use a private mmap of the RVM data file, so in low memory situations,
clean in-memory pages are simply discarded and paged back in. Dirty
pages are written to swap.
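
A minimal sketch of that private-mapping behavior (the file name is
made up, and Coda's actual RVM code is of course more involved):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("rvm_data", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* MAP_PRIVATE: writes never reach the file. Pages that are only
     * read stay file-backed and can be dropped and re-read under
     * memory pressure; pages that are written become copy-on-write
     * anonymous pages, which the kernel pushes to swap when needed. */
    char *data = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    char c = data[0];   /* clean page: file-backed, discardable */
    data[0] = c + 1;    /* dirty page: now anonymous, swap-backed */

    munmap(data, st.st_size);
    close(fd);
    return 0;
}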
A problem with your setup is that the box that runs the server will also
have to run a client in order to provide local access. That means the
metadata is cached both by the server and by the client, and possibly
some of it in the kernel and in the Samba daemon as well. I think that
would really get a bit tight with only 32MB of memory.
If that becomes the major issue, it wouldn't be too hard to set up a
server and two clients, but if I understand your comments correctly,
I suspect there are other issues.
Other problems are that clients connected to the Samba daemon won't be
able to repair conflicts, so a conflict is pretty much fatal in such a
setup (and in Coda's optimistic model, conflicts are unavoidable).
So we'd be better off using a Coda client on each PC workstation.
Also, unlike the Samba and NFS daemons, Coda servers are stateful: they
remember which clients have fetched copies of which objects, and send
callbacks if any of those files change. Every callback requires a bit of
allocated memory, and with many files times many clients that does add
up, but in your case you'd only have two, maybe three, clients.
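
As a back-of-the-envelope illustration (the per-callback size here is a
placeholder I'm assuming for the example, not Coda's real bookkeeping
cost):

#include <stdio.h>

int main(void)
{
    /* Assumed, illustrative numbers; only the shape of the product
     * matters: callback state scales with cached objects x clients. */
    const long bytes_per_callback = 64;
    const long cached_objects     = 10000;   /* per client */
    const long clients            = 3;

    printf("~%ld KB of callback state\n",
           bytes_per_callback * cached_objects * clients / 1024);
    return 0;
}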
Our old Coda deployment ran on reasonably modest hardware. The Coda
test server was a Pentium 90 with 64MB of memory and it didn't really
have much trouble, although it did have swap, the RVM log, RVM data, and
the file data (/vicepa) on separate spindles (4 SCSI drives).
The main server group used to consist of something like Pentium II
200MHz machines with 128MB of memory, but again we spread swap, the RVM
log, RVM data, and file data across different disks.
Also, how should I structure this so that all the data is available to
both sides, even in the event of a VPN failure? (These boxes would be
pretty much on opposite sides of the globe, so I can't really be sure
the VPN will be available 100% of the time.) Does this mean that the
client would have to actively hoard all the data? Is that practical? Or
should I use a different approach?
It really depends on how many file objects you are talking about. If
each file is 1GB, then we're only talking about ~300 files, and I don't
see any problem with hoarding everything.
If each file is ~4KB, then I don't think it is feasible (at the
moment): the client won't be able to keep all the metadata in memory,
and it will basically bring the device to a virtual halt in a swap
frenzy about every 10 minutes during the hoard walk.
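
To put rough numbers on both extremes (the per-object metadata cost is
an assumed ballpark, not a measured figure):

#include <stdio.h>

int main(void)
{
    const double total_bytes  = 300e9;   /* ~300 GB of CAD data */
    const double meta_per_obj = 1024;    /* assumed bytes of metadata */

    double big   = total_bytes / 1e9;    /* 1GB files */
    double small = total_bytes / 4096;   /* 4KB files */

    printf("1GB files: %.0f objects, ~%.0f KB of metadata\n",
           big, big * meta_per_obj / 1024);
    printf("4KB files: %.0f objects, ~%.0f GB of metadata\n",
           small, small * meta_per_obj / 1e9);
    return 0;
}

At the 4KB end that is tens of millions of objects, and with the
assumed per-object cost above the metadata alone runs to tens of
gigabytes, which is hopeless on a 32MB box.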
The files tend to be large, but not that large. I'll have to look at it.
Have you considered a setup that periodically mirrors or syncs both
sites with something like unison or rsync? I just think that if your
clients are going to use a stateless filesystem to access the data on
the appliances, they would suffer from the drawbacks of Coda's weaker
consistency model (no file locking, files becoming inaccessible due to
conflicts) without really benefiting from Coda's features (persistent
local disk cache, fast access to cached file data, write-back logging
and log optimizations, directory ACLs for access control).
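
For example, a crude one-way mirror could be as small as the loop below
(host and paths are hypothetical; unison would handle the two-way
reconciliation that rsync alone doesn't):

#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    for (;;) {
        /* -a preserves permissions and times, -u never overwrites a
         * file that is newer on the receiving side. A matching pull
         * would run on the peer. */
        (void)system("rsync -au /srv/cad/ peer.example.com:/srv/cad/");
        sleep(15 * 60);   /* re-sync every 15 minutes */
    }
}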
I used to use unison to do something like this.
What I was hoping for is a real-time solution, as the two offices are
nearly 12 hours apart, and the workweeks differ due to cultural
differences, giving us only a few hours a week of overlap (or downtime,
depending on your perspective).
What concerns me is the comment that Coda doesn't do file locking.
Maybe I misread something, or just assumed that Coda would do remote
file locking.
In our situation, we work with "assemblies" where each assembly consists
of many separate files, which are opened concurrently either RO or RW,
and file locking across the network is essential to prevent corruption
of the entire assembly.
Am I reading your comments correctly that Coda doesn't do file locking
across the network? Or would that just be a consequence of my proposed
Samba<->Coda setup? Would I get file locking if each PC were a Coda
client?
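
For concreteness, the kind of lock I mean is a standard POSIX fcntl()
whole-file lock, something like the sketch below (the path is made up).
If I understand your comments correctly, a lock like this would only be
seen by the local machine, not by clients at the other site:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/coda/example/assembly.prt", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct flock fl = {
        .l_type   = F_WRLCK,    /* exclusive write lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,          /* 0 = lock the whole file */
    };

    /* F_SETLKW blocks until the lock is granted. */
    if (fcntl(fd, F_SETLKW, &fl) < 0) { perror("fcntl"); return 1; }

    /* ... modify the assembly file under the lock ... */

    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return 0;
}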
--Yan
--
o__
,>/'_ o__
(_)\(_) ,>/'_ o__
Yan Seiner (_)\(_) ,>/'_ o__ o__
Certified Personal Trainer (_)\(_) ,>/'_ ,>/'_
Licensed Professional Engineer (_)\(_) (_)\(_)
Linux stuff has made big progress over the competition. When things sit and
don't start right away, we have a watch, and those poor guys have to settle for
an hourglass.