hi marcel,

On Sun, Sep 25, 2011 at 3:40 PM, Marcel Bruch wrote:
> Hi,
>
> I'm looking for some advice whether Jackrabbit might be a good choice for my
> problem. Any comments on this are greatly appreciated.
>
>
> = Short description of the challenge =
>
> We've built a Eclipse based tool that analyzes java source files and stores
> its analysis results in additional files. The workspace potentially has
> hundreds of projects and each project may have up to a few thousands of
> files. Say, there will be 200 projects and 1000 java source files per project
> in a single workspace. Then, there will be 200*1000 = 200.000 files.
>
> On a full workspace build, all these 200k files have to be compiled (by the
> IDE) and analyzed (by our tool) at once and the analysis results have to be
> dumped to disk rather fast.
> But the most common use case is that a single file is changed several times
> per minute and thus gets frequently analyzed.
>
> At the moment, the analysis results are dumped on disk as plain json files;
> one json file for each java class. Each json file is around 5 to 100kb in
> size; some files grow up to several megabytes (<10mb), these files have a few
> hundred JSON complex nodes (which might perfectly map to nodes in JCR).
>
> = Question =
>
> We would like to change the simple file system approach by a more
> sophisticated approach and I wonder whether Jackrabbit may be a suitable
> backend for this use case. Since we map all our data to JSON already, it
> looks like Jackrabbit/JCR is a perfect fit for this but I can't say for sure.
>
> What's your suggestion? Is Jackrabbit capable to quickly load and store
> json-like data - even if 200k files (nodes + their sub-nodes) have to be
> updated very in very short time?

absolutely. if the data is reasonably structured/organized,
jackrabbit should be a perfect fit.

i suggest using the java package hierarchy to organize the data
(e.g. org.apache.jackrabbit.core.TransientRepository ->
/org/apache/jackrabbit/core/TransientRepository), so that each package
segment becomes one level of the node tree. for further data modeling
recommendations see [0].

cheers
stefan

[0] http://wiki.apache.org/jackrabbit/DavidsModel

>
>
> Thanks for your suggestions. I've you need more details on what operations
> are performed or how data looks like, I would be glad to take your questions.
>
> Marcel
>
> --
> Eclipse Code Recommenders:
> w www.eclipse.org/recommenders
> tw www.twitter.com/marcelbruch
> g+ www.gplus.to/marcelbruch
>
>
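
p.s. the package-to-path mapping suggested above could be sketched roughly as follows. this is only an illustration in plain java: the class name PackagePathMapper and the method toJcrPath are made up for this example and are not part of jackrabbit or the JCR API, and it assumes top-level classes with ordinary dotted names.

```java
/**
 * Hypothetical helper (not part of Jackrabbit): maps a fully qualified
 * Java class name to an absolute, slash-separated node path.
 */
final class PackagePathMapper {

    private PackagePathMapper() {}

    /** Turns "a.b.C" into "/a/b/C"; rejects empty segments such as "a..C". */
    static String toJcrPath(String qualifiedName) {
        if (qualifiedName == null || qualifiedName.isEmpty()) {
            throw new IllegalArgumentException("empty class name");
        }
        StringBuilder path = new StringBuilder();
        // limit -1 keeps trailing empty strings, so "a.b." is rejected too
        for (String segment : qualifiedName.split("\\.", -1)) {
            if (segment.isEmpty()) {
                throw new IllegalArgumentException("empty segment in " + qualifiedName);
            }
            path.append('/').append(segment);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // the example from the mail above
        System.out.println(toJcrPath("org.apache.jackrabbit.core.TransientRepository"));
    }
}
```

the resulting string would then serve as the absolute path of the node holding the analysis results for that class; how nested or anonymous classes are named is left open here.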
