Bill,

Would you know if that is expected or a bug?

On Wed, Jun 6, 2012 at 5:56 PM, Bill Graham <billgra...@gmail.com> wrote:

> One thing to be aware of when accessing the pig.script option is that AFAIK
> there's a limit to how large the script can be, after which the rest would
> be truncated.
>
> On Wed, Jun 6, 2012 at 5:44 PM, Prashant Kommireddi <prash1...@gmail.com> wrote:
>
> > I completely agree that's an option. But IMHO being able to do that upfront
> > would be a nice feature, adding cron is just an additional process we could
> > avoid if possible.
> >
> > On Wed, Jun 6, 2012 at 5:39 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
> >
> > > You can write a nightly cron that runs the JobHistoryLoader job and
> > > stores parsed scripts to hdfs...
> > >
> > > D
> > >
> > > On Wed, Jun 6, 2012 at 5:16 PM, Prashant Kommireddi <prash1...@gmail.com> wrote:
> > >
> > > > I think that would be more of a post-process vs having Pig write the same
> > > > to a HDFS location. That would avoid having to parse it from job.xml.
> > > >
> > > > On Wed, Jun 6, 2012 at 4:19 PM, Daniel Dai <da...@hortonworks.com> wrote:
> > > >
> > > >> One existing solution is "pig.script" entry inside job.xml, it is the
> > > >> serialized Pig script. JobHistoryLoader can load job.xml files and grab
> > > >> those entries. Does that solve your problem?
> > > >>
> > > >> Daniel
> > > >>
> > > >> On Wed, Jun 6, 2012 at 3:52 PM, Prashant Kommireddi <prash1...@gmail.com> wrote:
> > > >>
> > > >> > Hi All,
> > > >> >
> > > >> > What do you guys think about adding a feature to be able to persist the
> > > >> > script (file or cache in case of grunt) on HDFS or locally based on an
> > > >> > admin setting (pig.properties). This will help infrastructure/ops teams
> > > >> > analyze nature of Pig scripts and be able to make certain decisions based
> > > >> > on it (optimizing data storage based on access patterns etc). This is
> > > >> > actually something we want to do but the challenge is there is no central
> > > >> > place where we can track user scripts.
> > > >> >
> > > >> > It could be a config param "pig.persist.script=/pig/". The script could be
> > > >> > stored with a configurable name -> ${mapred.job.name}+${user.name}+timestamp"
> > > >> > either on HDFS or local based on the configuration setting.
> > > >> >
> > > >> > Thanks,
> > > >> > Prashant
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> billgra...@gmail.com going forward.*
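
For anyone who wants to try the job.xml route Daniel mentions without going through JobHistoryLoader, here is a rough sketch of pulling the "pig.script" entry out of a job.xml by hand. It assumes only the standard Hadoop configuration layout (property elements with name and value children). The function name read_job_property and the sample document are made up for illustration, and the value comes back exactly as stored: the thread only says it is the serialized script, so any decoding step is left out, and per Bill's note the value may already be truncated.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_job_property(job_xml_text: str, name: str) -> Optional[str]:
    """Return the raw value of a named property from job.xml text, or None if absent.

    Hypothetical helper: assumes the usual Hadoop layout of
    <configuration><property><name/><value/></property>...</configuration>.
    """
    root = ET.fromstring(job_xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Made-up sample; a real job.xml has many more properties, and the
# pig.script value is the serialized (possibly truncated) script.
SAMPLE_JOB_XML = """<configuration>
  <property><name>mapred.job.name</name><value>example-job</value></property>
  <property><name>pig.script</name><value>PLACEHOLDER_SERIALIZED_SCRIPT</value></property>
</configuration>"""

if __name__ == "__main__":
    print(read_job_property(SAMPLE_JOB_XML, "pig.script"))
```

This is still the post-processing approach discussed above, so it does not replace the proposed pig.persist.script setting; it only shows what a nightly cron would be reading.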