Cool, thanks!

On Mon, Jun 14, 2010 at 6:55 PM, Dmitriy Ryaboy <[email protected]> wrote:
> Hey Russ, you should check out the SchemaAwarePigLoader or whatever it was I
> wound up calling it.
> Dumps a header file next to the data so you can just cat everything
> together. Should be in 0.7 piggybank.
>
> On Mon, Jun 14, 2010 at 5:01 PM, Russell Jurney <[email protected]> wrote:
>
> > When I need reports, I do:
> >
> > echo "My\tcolumn\tnames\n" > report.tsv
> > hdfs -cat my/pig/output/* >> report.tsv
> >
> > If you need something more elaborate, you could use something like
> > http://search.cpan.org/dist/Spreadsheet-WriteExcel/ or simply load your
> > TSV into a database with a script after your Pig job finishes, and use
> > any of the database reporting tools.
> >
> > MySQL (with the Infobright engine if you have bigger data output) and
> > something like Pentaho would work: http://www.pentaho.com/
> > Tableau is really nice, and can load smaller TSV directly, but is Windows
> > only and a bit pricey. http://www.tableausoftware.com/
> >
> > Russ
> >
> > On Mon, Jun 14, 2010 at 4:24 PM, elein <[email protected]> wrote:
> >
> > > Do there exist any reporting tools that can run on top of
> > > Pig or using Pig? Or does everyone load TSV results into some type of
> > > Excel?
> > >
> > > I will need to create reports with labels and sequential Pig queries
> > > and any fancy display stuff I can send out with email.
> > >
> > > elein
> > > [email protected]
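
For anyone finding this thread later, here is a minimal sketch of the header-plus-cat approach Russ describes, wrapped in a function. Several things are assumptions and not from the thread: the function name build_report, the comma-separated header convention, the CAT_CMD override, and the use of `hadoop fs -cat` as the read command. It uses printf instead of echo because some shells (bash without -e, for example) print `\t` literally.

```shell
#!/bin/sh
# Sketch only: build a TSV report with a header row from Pig output files.
# Assumption: CAT_CMD defaults to `hadoop fs -cat`; set CAT_CMD=cat to read
# part files from the local filesystem instead.

# build_report HEADER OUTFILE PATH...
# HEADER is a comma-separated list of column names (hypothetical convention).
build_report() {
    header=$1
    outfile=$2
    shift 2
    # Write the header line, turning commas into tabs.
    printf '%s\n' "$header" | tr ',' '\t' > "$outfile" || return 1
    # Append the data rows after the header.
    ${CAT_CMD:-hadoop fs -cat} "$@" >> "$outfile"
}

# Example (not run here):
#   build_report "My,column,names" report.tsv 'my/pig/output/part-*'
```

Matching part-* instead of * is a deliberate choice in the example: it skips any non-data entries a job may leave in the output directory. Quoting the glob leaves the expansion to the Hadoop shell and not to the local one.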
