Mark Chadwick
Tue, 01 Jul 2008 12:00:48 -0700
XML isn't particularly well suited for a Map/Reduce environment, however. Hierarchical data is very tough to partition out to mappers. On top of that, because the data will be partitioned in 64M blocks (by default), there's a very good chance that a random 64M chunk of a large XML file will not even be parsable (opening tags in one block, closing tags in another).

On Tue, Jul 1, 2008 at 2:30 PM, Olga Natkovich <[EMAIL PROTECTED]> wrote:

> This can work but you would need to write a custom loader to parse the
> data: http://wiki.apache.org/pig/StorageFunction
>
> Olga
>
> > -----Original Message-----
> > From: Kayla Jay [EMAIL PROTECTED]
> > Sent: Tuesday, July 01, 2008 11:24 AM
> > To: pig-user@incubator.apache.org
> > Subject: Pig + xml ?
> >
> > Hi
> >
> > Can you use Pig with XML data files? If so, does anyone have
> > any examples?
> > I want to do something that would equate to an XPath query
> > against the XML.
> >
> > Thanks.
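
P.S. A rough illustration of the block-splitting problem, as a minimal sketch rather than anything Hadoop or Pig actually runs. It builds a small XML document, cuts it into fixed-size byte blocks (256 bytes here as a stand-in for the 64M default), and checks whether each block parses on its own. The names is_parsable and split_into_blocks and the sample record layout are made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import List


def is_parsable(chunk: bytes) -> bool:
    """Return True if the chunk is a well-formed XML document on its own."""
    try:
        ET.fromstring(chunk)
        return True
    except ET.ParseError:
        return False


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut data at fixed byte offsets, ignoring element boundaries."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# Hypothetical sample data: 100 small records under a single root element.
records = "".join(
    f"<record><id>{i}</id><name>user{i}</name></record>" for i in range(100)
)
document = f"<records>{records}</records>".encode("utf-8")

# 256 bytes stands in for the 64M block size; the effect is the same.
blocks = split_into_blocks(document, 256)
parsable = [is_parsable(block) for block in blocks]

print("whole document parses:", is_parsable(document))
print("blocks that parse on their own:", sum(parsable), "of", len(blocks))
```

The whole document parses, but the individual blocks do not: each one either opens tags it never closes, closes tags it never opened, or holds several sibling elements with no root. A custom loader would have to deal with those record boundaries itself.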