Thanks for the offer. I'm thinking of a situation where I don't know the schema ahead of time. For example, a JMS queue whose XML messages I simply want to store somewhere and let some other program parse. This is a thought experiment.
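
One way around the memory limit, without knowing the schema, might be to split each message into fixed-size chunks and store every chunk as its own entry. Below is a minimal sketch of that idea in plain Java. The class name XmlChunker, the method writeChunks, and the sink callback are all hypothetical names made up for illustration, and nothing here calls Accumulo itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of Accumulo: streams a large record to a
// sink in fixed-size chunks so the whole record is never held in memory.
public class XmlChunker {

    // Reads `in` chunkSize bytes at a time and passes each chunk to `sink`
    // together with a zero-padded index ("00000000", "00000001", ...), so
    // that lexicographic key order matches chunk order. Returns the number
    // of chunks emitted.
    public static int writeChunks(InputStream in, int chunkSize,
                                  BiConsumer<String, byte[]> sink) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buf = new byte[chunkSize];
        int count = 0;
        while (true) {
            // Fill the buffer completely unless the stream ends first.
            int filled = 0;
            while (filled < chunkSize) {
                int n = in.read(buf, filled, chunkSize - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
            if (filled == 0) {
                break; // nothing left to emit
            }
            // Copy so the sink owns its bytes; the last chunk may be short.
            sink.accept(String.format("%08d", count), Arrays.copyOf(buf, filled));
            count++;
            if (filled < chunkSize) {
                break; // end of stream reached inside this chunk
            }
        }
        return count;
    }
}
```

In an Accumulo setting the sink would presumably build one Mutation per chunk, for instance with the message id as the row and the chunk index as the column qualifier, and the reader would scan the row and concatenate the values in order. That layout is an assumption on my part, not something I have tested.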

On Sun, Jun 17, 2012 at 1:06 PM, Jim Klucar <[email protected]> wrote:
> David,
>
> Can you give a taste of the schema of the XML? With that we may be
> able to help break the XML file up into keys and help create an index
> for it. IMHO that's the power you would get from accumulo. If you just
> want it as one big lump, and don't need to search it or only retrieve
> portions of the file, then putting it in accumulo is just adding
> overhead to hdfs.
>
>
> Sent from my iPhone
>
> On Jun 17, 2012, at 9:54 AM, David Medinets <[email protected]> wrote:
>
>> Some of the XML records that I work with are over 50M. I was hoping to
>> store them inside of Accumulo instead of the text-based HDFS XML super
>> file currently being used. However, since they are so large I can't
>> create a Value object without running out of memory. Storing values
>> this large may simply be using the wrong tool, please let me know.
