Hi Zheng,

Here's the plan for the second map-reduce job -- http://pastebin.com/m59d5a84b

I don't see compression anywhere.

Saurabh.

On Fri, Aug 21, 2009 at 11:30 AM, Zheng Shao <[email protected]> wrote:
> Hi Saurabh,
>
> Sorry for the delay on this. We are busy with the production this week.
>
> I don't think there is much difference in CLI queries and JDBC queries.
>
> Yes, this is what I am talking about. Since your query has 2
> map-reduce jobs, there will be two .xml files.
> Can you show us the second one? Does the second one also contain
> "<...>compressed<...>true<...>" in the section of FileSinkOperator?
>
> Zheng
>
> On Tue, Aug 18, 2009 at 3:21 AM, Saurabh Nanda<[email protected]> wrote:
> > Is this what you're talking about -- http://pastebin.ca/1533627 ? Seems like
> > compression is on.
> >
> > Is there any difference in how CLI queries and JDBC queries are treated?
> >
> > Saurabh.
> >
> > On Tue, Aug 18, 2009 at 11:19 AM, Zheng Shao <[email protected]> wrote:
> >>
> >> Hi Saurabh,
> >>
> >> So the compression flag is correct when the plan is generated.
> >> When you run the query, you should see "plan = xxx.xml" in the log
> >> file. Can you open that file (in HDFS) and see whether the compression
> >> flag is on or not?
> >>
> >> Zheng
> >>
> >> On Mon, Aug 17, 2009 at 5:17 AM, Saurabh Nanda<[email protected]> wrote:
> >> > Hey Zheng, any clues as to what the bug is? Or what I'm doing wrong? I can
> >> > do some more digging and logging if required.
> >> >
> >> > Saurabh.
> >> >
> >> > On Mon, Aug 17, 2009 at 1:28 PM, Saurabh Nanda <[email protected]> wrote:
> >> >>
> >> >> Here's the log output:
> >> >>
> >> >> 2009-08-17 13:26:42,183 INFO parse.SemanticAnalyzer
> >> >> (SemanticAnalyzer.java:genFileSinkPlan(2575)) - Created FileSink Plan for
> >> >> clause: insclause-0dest_path:
> >> >> hdfs://master-hadoop/user/hive/warehouse/raw/dt=2009-04-07 row schema:
> >> >> {(_col0,_col0: string)(_col1,_col1: string)(_col2,_col2:
> >> >> string)(_col3,_col3: string)(_col4,_col4: string)(_col5,_col5:
> >> >> string)(_col6,_col6: string)(_col7,_col7: string)(_col8,_col8:
> >> >> string)(_col9,_col9: string)(_col10,_col10: int)} .
> >> >> HiveConf.ConfVars.COMPRESSRESULT=true
> >> >>
> >> >> Is the SemanticAnalyzer run more than once in the lifetime of a job?
> >> >> Should I be looking for another log entry like this one?
> >> >>
> >> >> Saurabh.
> >> >>
> >> >> On Mon, Aug 17, 2009 at 1:26 PM, Saurabh Nanda <[email protected]> wrote:
> >> >>>
> >> >>> Strange. The compression configuration log entry was also at INFO level,
> >> >>> but I could see it in the task logs:
> >> >>>
> >> >>> LOG.info("Compression configuration is:" + isCompressed);
> >> >>>
> >> >>> Saurabh.
> >> >>>
> >> >>> On Mon, Aug 17, 2009 at 12:56 PM, Zheng Shao <[email protected]> wrote:
> >> >>>>
> >> >>>> The default log level is WARN. Please change it to INFO.
> >> >>>>
> >> >>>> hive.root.logger=INFO,DRFA
> >> >>>>
> >> >>>> Of course you can also use LOG.warn() in your test code.
> >> >>>>
> >> >>>> Zheng

--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
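
For anyone repeating the check discussed in this thread (does each job's plan .xml have "compressed" set to true in its FileSinkOperator section?), here is a rough Python sketch. It assumes the plan file has already been copied out of HDFS and that it uses the java.beans.XMLEncoder layout, i.e. <object class="...FileSinkOperator"> elements containing <void property="compressed"><boolean>true</boolean></void>. That layout, the class names in the sample, and the function name file_sink_compression are assumptions made for illustration, not something confirmed by the pastebin links above.

```python
import xml.etree.ElementTree as ET
from typing import List


def file_sink_compression(plan_xml: str) -> List[bool]:
    """Return one compression flag per FileSinkOperator found in the plan XML.

    Assumes an XMLEncoder-style layout; a missing "compressed" property is
    reported as False.
    """
    root = ET.fromstring(plan_xml)
    flags = []
    for obj in root.iter("object"):
        # Only look at operator objects whose class name ends in FileSinkOperator.
        if not obj.get("class", "").endswith("FileSinkOperator"):
            continue
        compressed = False
        for prop in obj.iter("void"):
            if prop.get("property") == "compressed":
                compressed = prop.findtext("boolean", "").strip() == "true"
                break
        flags.append(compressed)
    return flags


# Hypothetical, heavily trimmed plan fragment used only to exercise the function.
SAMPLE = """<java>
  <object class="org.apache.hadoop.hive.ql.exec.FileSinkOperator">
    <void property="conf">
      <object class="org.apache.hadoop.hive.ql.plan.fileSinkDesc">
        <void property="compressed"><boolean>true</boolean></void>
      </object>
    </void>
  </object>
  <object class="org.apache.hadoop.hive.ql.exec.FileSinkOperator">
    <void property="conf">
      <object class="org.apache.hadoop.hive.ql.plan.fileSinkDesc"/>
    </void>
  </object>
</java>"""

print(file_sink_compression(SAMPLE))
```

Running this against the plan of each map-reduce job would show which job's file sink lost the flag. If a real plan file references an operator by idref instead of inlining it, this sketch will not follow the reference.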
