Hi Vaibhav,

Did you get time to look into the issue of query-based compaction not working? Meanwhile, I have submitted a patch (https://jira.apache.org/jira/browse/HIVE-21280) for the same. Can you please help me review it?

Thanks,
Aditya

On Tue, Feb 19, 2019 at 11:30 AM Vaibhav Gumashta <vgumas...@hortonworks.com> wrote:

> The approach is similar, but it is not identical. Let me go over the query-based compaction codepath to see if I spot this bug there.
>
> Thanks,
> --Vaibhav
>
> From: Aditya Shah <adityashah3...@gmail.com>
> Date: Saturday, February 16, 2019 at 3:44 AM
> To: Vaibhav Gumashta <vgumas...@hortonworks.com>
> Cc: "dev@hive.apache.org" <dev@hive.apache.org>, Eugene Koifman <ekoif...@hortonworks.com>, Gopal Vijayaraghavan <go...@hortonworks.com>
> Subject: Re: Null pointer exception on running compaction against an MM table
>
> Hi,
>
> Thanks for the reply. I have opened a JIRA (HIVE-21280) for the same and will upload a patch soon. I also have doubts about the new query-based compactor for full CRUD tables that went into master in HIVE-20699. Does major compaction work there, given that it uses a query-based compactor similar to the one for MM tables? I expect the same problem to exist there as well.
>
> Aditya
>
> On Sat, Feb 16, 2019 at 2:34 AM Vaibhav Gumashta <vgumas...@hortonworks.com> wrote:
>
> Aditya,
>
> Thanks for reporting this. Would you like to create a JIRA for this (https://issues.apache.org/jira/projects/HIVE)? Additionally, if you would like to work on a fix, I'm happy to help in reviewing.
>
> --Vaibhav
>
> From: Aditya Shah <adityashah3...@gmail.com>
> Date: Friday, February 15, 2019 at 2:05 AM
> To: "dev@hive.apache.org" <dev@hive.apache.org>
> Cc: Eugene Koifman <ekoif...@hortonworks.com>, Vaibhav Gumashta <vgumas...@hortonworks.com>, Gopal Vijayaraghavan <go...@hortonworks.com>
> Subject: Null pointer exception on running compaction against an MM table
>
> Hi,
>
> I was trying to run compaction on an MM table but got a null pointer exception while getting the HDFS session path.
> The error suggested to me that the session state was not started for these queries. Am I missing something here? I do think the session state needs to be started for each of the queries (insert into temp table, etc.) that run for compaction on HMS (I also have doubts about the statsupdater thread's queries). Some details are as follows:
>
> Env./Versions: Using Hive-3.1.1 (rel/release-3.1.1)
>
> Steps to reproduce:
>
> 1) Use beeline with HS2 and HMS
> 2) Create an MM table
> 3) Insert a few values into the table
> 4) alter table mm_table compact 'major' and wait;
>
> Stack trace on HMS:
>
> compactor.Worker: Caught exception while trying to compact id:8,dbname:default,tableName:acid_mm_orc,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0. Marking failed to avoid repeated failures, java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES ( 'serialization.format'='1')STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base' TBLPROPERTIES ('transactional'='false')
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:373)
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:241)
>     at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES ( 'serialization.format'='1')STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base' TBLPROPERTIES ('transactional'='false')
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:525)
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:365)
>     ... 2 more
> Caused by: java.lang.NullPointerException: Non-local session path expected to be non-null
>     at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
>     at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:815)
>     at org.apache.hadoop.hive.ql.Context.<init>(Context.java:309)
>     at org.apache.hadoop.hive.ql.Context.<init>(Context.java:295)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:591)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1684)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1807)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1556)
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:522)
>     ... 3 more
>
> Observations:
>
> 1) SessionState.start() initializes paths, hivehist, etc.
> 2) SessionState is not started in setupSessionState() in runMmCompaction(). (There is also a comment by Sergey in the code regarding the same.)
> 3) Even after making it start the session state, it further fails while running a Tez task for the insert overwrite on the temp table with the contents of the original table.
> 4) The cause of 3) is that the Tez session state cannot be initialized: an IllegalArgumentException is thrown while setting up the caller context in the Tez task, because the caller id is empty.
> 5) The reason for 4) is that the query id is an empty string for such queries.
> 6) A possible solution for 5): build the query state with a query id in runOnDriver() in DriverUtils.java.
>
> Do let me know if you need some more information for the same.
>
> Thanks and Regards,
> Aditya Shah
> 5th Year
> M.Sc.(Hons.) Mathematics & B.E.(Hons.) Computer Science and Engineering
> Birla Institute of Technology & Science, Pilani
> Vidhya Vihar
> Pilani 333 031 (Raj.), India
> Phone: +91 7689996342
>
> --
> Aditya Shah

--
Aditya Shah