Just an FYI, if you're interested: xmlsh (www.xmlsh.org) can do this pretty much out of the box, including posting to MarkLogic (via the MarkLogic extension module). CSV files can be turned directly into XML with "csv2xml"; in the degenerate case, each line becomes one XML element. Alternatively, you can read the lines one by one, split them into fields with "read", create the XML, and post it to a ML server, all in one scripting language. It is also pretty efficient, since everything runs within one JVM (no subprocess creation), about as efficient as a "native" Java app.

-David Lee

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jason Hunter
Sent: Wednesday, May 20, 2009 6:14 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] ingest log files?

Hi Jakob,

I'd recommend using a Java program (or .NET, depends on your preference) to parse the file and construct basic XML out of each line, and pass that XML to MarkLogic via XCC.

-jh-

On May 20, 2009, at 1:38 AM, Jakob Fix wrote:

> Hello,
>
> here's another question from a MarkLogic newbie. I was thinking of
> indexing Apache log files (or any other line-based log files, for this
> matter) for later exploitation. I was thinking of feeding this XML
> to some kind of dashboard application which would be able to create
> nice graphs and buckets etc.
>
> Is it possible and does it make sense at all to create a CPF pipeline
> that will parse and enrich each line to create an XML representation?
> Is something like this already worked on? Another approach?
>
> --
> cheers,
> Jakob.

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
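
For illustration, the "parse each line and construct basic XML" step suggested in the thread might look roughly like the Java sketch below. It is not code from anyone in the thread. It assumes the plain Apache Common Log Format (host, ident, user, time, request, status, bytes); the class name LogLineToXml and the element names (entry, host, time, and so on) are made up for this example. Lines that don't match, such as combined-format lines with referer and user-agent fields, are wrapped as raw text instead of being dropped. Sending the result to MarkLogic via XCC is deliberately left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns one Apache Common Log Format line into a small XML fragment.
class LogLineToXml {
    // Assumed layout: host ident user [time] "request" status bytes
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)$");

    // Escape the characters that are not allowed verbatim in XML element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Element names are illustrative only; pick whatever suits your dashboard queries.
    static String toXml(String line) {
        Matcher m = CLF.matcher(line);
        if (!m.matches()) {
            // Keep unparseable lines so nothing is silently lost.
            return "<entry parsed=\"false\"><raw>" + escape(line) + "</raw></entry>";
        }
        return "<entry>"
            + "<host>" + escape(m.group(1)) + "</host>"
            + "<ident>" + escape(m.group(2)) + "</ident>"
            + "<user>" + escape(m.group(3)) + "</user>"
            + "<time>" + escape(m.group(4)) + "</time>"
            + "<request>" + escape(m.group(5)) + "</request>"
            + "<status>" + m.group(6) + "</status>"
            + "<bytes>" + escape(m.group(7)) + "</bytes>"
            + "</entry>";
    }

    // Reads log lines from standard input and prints one XML fragment per line.
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
            new InputStreamReader(System.in, StandardCharsets.UTF_8));
        String line;
        while ((line = in.readLine()) != null) {
            if (!line.isEmpty()) {
                System.out.println(toXml(line));
            }
        }
    }
}
```

Something like "java LogLineToXml < access.log" would then print one fragment per log line, and each fragment could be inserted as its own document or batched, depending on how the dashboard queries are expected to run.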
