David E Jones wrote:
> 
> This is an interesting overview and while I'm not sure why I hadn't
> thought along these lines before, at least it's through my thick skull
> now...
> 
> I asked Adam about how this would deploy on multiple servers with the
> stuff in the filesystem versus the database, and I think what you've
> written Ean is the answer.
> 
> Why not treat a source repo (either plain SVN or something more exotic
> like GIT) like the database? Each app server would read from and write
> to the source repo just like it would a database record. If SVN or GIT
> support 2-phase commits we could probably even do write operations in
> a transaction that includes connections to both data stores.
> 
> For performance reasons you'd want to cache content from the source repo
> just like you would content from a relational database. If it's really
> too terribly slow even doing that (ie reading directly from the repo and
> caching) you could cache it locally in the app server's file system,
> though it would probably be best to never write directly to the local
> filesystem and you'd want some sort of timeout or other logic to
> invalidate the file system cache just like you'd do with the in memory
> cache (actually UtilCache supports this sort of thing, though not with
> straight files in the filesystem, just a sort of mini-database for local
> filesystem caching of data).
> 
> Anyway, is this something you guys have considered for WebSlinger?
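
On the caching point quoted above, here is a minimal sketch of a
read-through cache whose entries expire after a timeout. It is not
UtilCache and does not use its API; the class name ExpiringRepoCache,
the loader function (standing in for "read this location from the
repo") and the injectable clock are all assumptions made for
illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical read-through cache with per-entry expiry; not OFBiz's UtilCache. */
final class ExpiringRepoCache {
    private static final class Entry {
        final String content;
        final long loadedAtMillis;

        Entry(String content, long loadedAtMillis) {
            this.content = content;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // assumed: fetches content from the repo
    private final long ttlMillis;
    private final LongSupplier clock;              // injectable so expiry can be tested

    ExpiringRepoCache(Function<String, String> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns cached content, reloading it once the entry is ttlMillis old or older. */
    String get(String location) {
        long now = clock.getAsLong();
        Entry entry = entries.get(location);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            // Two threads may both reload the same stale entry; the last write wins.
            entry = new Entry(loader.apply(location), now);
            entries.put(location, entry);
        }
        return entry.content;
    }
}
```

In production the clock would be System::currentTimeMillis. Writes
never touch this cache directly; they would go to the repo, and the
cached copy simply ages out.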

I've got a commons-vfs filesystem implementation that uses git
plumbing to store content.  Every single mutation causes a new 'tree'
hash to be created in git.  It uses jgit to do this.  However, we
don't currently use it; it was more of a quick test.  One major
problem with jgit is that it reads the entire file into memory, which
will not work with large files.
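
To show that the memory problem is not inherent to git's object
model, here is a rough sketch that computes a blob id by streaming
the content through a fixed-size buffer. It uses only the JDK, not
jgit, and the class and method names are made up for the example. It
assumes the classic SHA-1 object format, where a blob id is the SHA-1
of the header "blob <length>\0" followed by the content, so the
length has to be known up front.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; computes a git blob id without buffering the whole file. */
final class StreamingBlobId {
    static String blobId(InputStream in, long length) {
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        // Object header: type, space, decimal length, NUL byte.
        sha1.update(("blob " + length + "\0").getBytes(StandardCharsets.US_ASCII));

        byte[] buffer = new byte[8192];
        long seen = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                sha1.update(buffer, 0, n); // only one buffer's worth is in memory
                seen += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (seen != length) {
            throw new IllegalArgumentException("expected " + length + " bytes, read " + seen);
        }

        StringBuilder hex = new StringBuilder(40);
        for (byte b : sha1.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

For an empty stream with length 0 this returns
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the id git itself gives the
empty blob. Actually storing the object would also need the content
deflated to disk in the same streaming fashion, which this sketch
leaves out.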

I have not tested whether this interoperates with other git porcelain.

However, all that is moot.  GIT is not a shared-write system.  Each
instance is completely local.  You have your own copy of the repo, per
install.  You mutate it however you like.  Then either you push to another
machine/repo, or the other machine pulls from you.  This could be made
to work, doing some kind of anonymous ssh pulse thing, but it'd be a
heavy system integration, which ofbiz tends not to do.

> For the OFBiz Content side of things you could pretty easily have a
> DataResourceType for data in a source repo (ie instead of LOCAL_FILE
> something like REPOSITORY_FILE). On the DataResource entity the
> objectInfo field would have the URL/location of the resource (ie like
> the SVN/HTTP URL), and we could add a field like "revisionNumber" to
> specify which revision we want or null to get the head revision (I was
> thinking we could use the existing ContentRevision/Item entities for
> this, but looking at them it seems they wouldn't work so well and are
> really meant for a revision control built on top of the Content and
> DataResource entities, and not one that would describe revision
> information pointed to by them). The "revisionNumber" could also go on
> the Content entity so that we could have multiple Content records with
> different revision numbers pointing to the same DataResource records and
> reduce how many DataResource records we would require. That would also
> better fit how Content and DataResource are meant to work together, but
> on the other hand might be somewhat confusing.

No, no, you can't use a revisionNumber.  They don't exist in git.
A distributed system has no single global counter to hand them out;
commits are identified by hashes instead.
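
If a DataResource were to point into a git repo, the pointer would
have to carry an object id rather than a number. A minimal sketch of
such a value type follows; RepositoryRef and its methods are
hypothetical names, not anything in OFBiz or jgit, and it assumes
full 40-character SHA-1 ids.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical reference to content in a git repo; not an OFBiz or jgit class. */
final class RepositoryRef {
    private static final Pattern FULL_SHA1 = Pattern.compile("[0-9a-f]{40}");

    private final String location;
    private final String objectId; // null: follow whatever the branch points to now

    private RepositoryRef(String location, String objectId) {
        this.location = location;
        this.objectId = objectId;
    }

    /** Pins the reference to one exact object; rejects anything that is not a full id. */
    static RepositoryRef pinned(String location, String objectId) {
        if (objectId == null || !FULL_SHA1.matcher(objectId).matches()) {
            throw new IllegalArgumentException("not a full object id: " + objectId);
        }
        return new RepositoryRef(location, objectId);
    }

    /** Unpinned reference, the analogue of "null means head" in the quoted proposal. */
    static RepositoryRef head(String location) {
        return new RepositoryRef(location, null);
    }

    String location() {
        return location;
    }

    Optional<String> objectId() {
        return Optional.ofNullable(objectId);
    }
}
```

Something like RepositoryRef.pinned(url, "42") fails immediately,
which is the point: an integer revision has no meaning here, and two
clones cannot agree on "revision 42" without a central server.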

> Thoughts anyone?
> 
> Oh, one more thing... I know there are some Java libraries for SVN, and
> there probably are some for GIT... has anyone played with these?

I've looked at the documentation for svn/java; I've actually used
jgit (however, it's been a few years).
