An embedded database cannot handle large-scale data and is not very efficient. I have about 1 billion records, and these records need to be passed through several modules. What I mean is a data exchange format similar to XML, but more flexible and efficient.
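To illustrate the kind of appendable, record-based file meant here, below is a minimal sketch, not an existing library and not something settled in this thread. It assumes a hypothetical layout in which every record is a 4-byte big-endian length followed by the raw payload bytes; the names `append_record` and `read_records` are made up for the example. The payload encoding (for example a serialized Protocol Buffers message) is left open.

```python
import struct
from typing import Iterator

# Assumed layout (hypothetical): 4-byte big-endian unsigned length, then payload.
_LEN = struct.Struct(">I")


def append_record(path: str, payload: bytes) -> None:
    """Append one record to the end of the file without rewriting earlier ones."""
    with open(path, "ab") as f:
        f.write(_LEN.pack(len(payload)))
        f.write(payload)


def read_records(path: str) -> Iterator[bytes]:
    """Yield the payload of each record in the order it was appended."""
    with open(path, "rb") as f:
        while True:
            header = f.read(_LEN.size)
            if not header:
                return  # clean end of file
            if len(header) < _LEN.size:
                raise ValueError("truncated length header")
            (length,) = _LEN.unpack(header)
            payload = f.read(length)
            if len(payload) < length:
                raise ValueError("truncated record")
            yield payload
```

Appending only writes at the end of the file, which is the property XML lacks, and a reader in any other language only has to read 4 bytes, decode the length, and read that many bytes.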
On Sun, Nov 2, 2008 at 10:49 AM, lamfeeling <[EMAIL PROTECTED]> wrote:
> Consider an embedded database? Berkeley DB, written in C++, has interfaces
> for many languages.
>
> On 2008-11-02 10:15:22, "Zhou, Yunqing" <[EMAIL PROTECTED]> wrote:
> >The project I am focused on has many modules written in different languages
> >(several modules are Hadoop jobs).
> >So I'd like to use a common record-based data file format for data
> >exchange.
> >XML is not efficient for appending new records.
> >SequenceFile does not seem to have an API for languages other than Java.
> >The Hadoop API for Protocol Buffers seems to be under development.
> >Any recommendations for this?
> >
> >Thanks