Hey Russ, you should check out the SchemaAwarePigLoader, or whatever it was I wound up calling it. It dumps a header file next to the data so you can just cat everything together. It should be in the 0.7 piggybank.
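
For reference, here is a minimal sketch of the header-then-cat approach from the quoted message. It assumes local files standing in for the Pig output; the temp directory, part file names, and column names are made up for illustration. It uses printf rather than echo because whether echo expands \t depends on the shell.

```shell
#!/bin/sh
# Sketch only: local files stand in for Pig output on HDFS.
# The directory, part files, and column names are hypothetical.
set -eu

out_dir=$(mktemp -d)                      # stand-in for my/pig/output
printf 'alice\t3\n' > "$out_dir/part-00000"
printf 'bob\t5\n'   > "$out_dir/part-00001"

report="$out_dir/report.tsv"

# Write the header row; printf interprets \t and \n portably.
printf 'name\tcount\n' > "$report"

# Append the part files in sorted glob order.
cat "$out_dir"/part-* >> "$report"

cat "$report"
```

Against HDFS, the cat step would presumably become something like `hadoop fs -cat my/pig/output/* >> report.tsv`.
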

On Mon, Jun 14, 2010 at 5:01 PM, Russell Jurney <[email protected]> wrote:

> When I need reports, I do:
>
> echo "My\tcolumn\tnames\n" > report.tsv
> hdfs -cat my/pig/output/* >> report.tsv
>
> If you need something more elaborate, you could use something like
> http://search.cpan.org/dist/Spreadsheet-WriteExcel/ or simply load your
> TSV into a database with a script after your pig job finishes, and use
> any of the database reporting tools.
>
> MySQL (with the Infobright engine if you have bigger data output) and
> something like Pentaho would work: http://www.pentaho.com/
> Tableau is really nice, and can load smaller TSV directly, but is Windows
> only and a bit pricey. http://www.tableausoftware.com/
>
> Russ
>
> On Mon, Jun 14, 2010 at 4:24 PM, elein <[email protected]> wrote:
>
> > Does there exist any reporting tools that can run on top of
> > pig or using pig? Or does everyone load TSV results in some type of
> > excel.
> >
> > I will need to create reports with labels and sequential pig queries
> > and any fancy display stuff I can send out with email.
> >
> > elein
> > [email protected]
