On Sun, May 2, 2010 at 2:29 AM, Sachin Bochare < [email protected]> wrote:
> Hi, > > I applied index patch available at : > https://issues.apache.org/jira/browse/HIVE-678 > > However after applying the indexing patch, simple select statements are not > showing any results. The "select *" is working but selecting a specific > column is not working. I have pasted an example below which illustrates the > problem. > > The same select is working without the patch on the same metastore_db. The > only difference between working code and non-working code is the patch. > > I used 796926 version of the code. The patch attached in HIVE-678 was > created on this version. > > Following example illustrates the problem: > > Example with patch code: > ----------------------------- > > ===================================== > hive> create table ourtest (empid int, firstname string, lastnamestring, > hoursworkedint) > partitioned by(dt string, place string) clustered by (empid) sorted > by(hoursworked) into 4 buckets row format delimited fields terminated by > ',' stored as textfile; > OK > Time taken: 0.307 seconds > hive> LOAD DATA LOCAL INPATH '/root/data/ourtest_data.csv' INTO > TABLE ourtest PARTITION(dt='2010-02-27', place='Pune'); > Copying data from file:/root/data/ourtest_data.csv > Loading data to table ourtest partition {dt=2010-02-27, place=Pune} > OK > Time taken: 0.753 seconds > hive> select * from ourtest; ---> Select * is working fine. > OK > 0 firstname lastname 0 2010-02-27 Pune > 1 firstname1 lastname1 1 2010-02-27 Pune > 2 firstname2 lastname2 2 2010-02-27 Pune > 3 firstname3 lastname3 3 2010-02-27 Pune > 4 firstname4 lastname4 4 2010-02-27 Pune > 5 firstname5 lastname5 5 2010-02-27 Pune > 6 firstname6 lastname6 6 2010-02-27 Pune > 7 firstname7 lastname7 7 2010-02-27 Pune > 8 firstname8 lastname8 8 2010-02-27 Pune > 9 firstname9 lastname9 9 2010-02-27 Pune > 10 firstname10 lastname10 10 2010-02-27 Pune > Time taken: 0.106 seconds > hive> select empid from ourtest; ---> Selecting specific column is not > working. > Total MapReduce jobs = 1 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_201002091652_0170, Tracking URL = > http://v-hadoop3.persistent.co.in:60030/jobdetails.jsp?jobid=job_201002091652_0170 > Kill Command = /root/hadoop-0.20.1/bin/../bin/hadoop job > -Dmapred.job.tracker=v-hadoop3.persistent.co.in:30001 -kill > job_201002091652_0170 > 2010-05-02 08:40:48,951 map = 0%, reduce =0% > 2010-05-02 08:40:58,044 map = 50%, reduce =0% > 2010-05-02 08:40:59,057 map = 100%, reduce =0% > 2010-05-02 08:41:02,067 map = 100%, reduce =100% > Ended Job = job_201002091652_0170 > OK > Time taken: 15.494 seconds > ===================================== > > Example without patch code: > -------------------------------- > Example query is working after using without-patch code on the same > metastore_db. > > ===================================== > r...@v-hadoop3<https://puneexchange.persistent.co.in/owa/UrlBlockedError.aspx> > :~/ <https://puneexchange.persistent.co.in/owa/UrlBlockedError.aspx>sachin > /Hive-796926-Patch<https://puneexchange.persistent.co.in/owa/UrlBlockedError.aspx># > ../Hive-796926/build/dist/bin/hive > Hive history file=/tmp/root/hive_job_log_root_201005020928_924651644.txt > hive> select empid from ourtest; > Total MapReduce jobs = 1 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_201002091652_0190, Tracking URL = > http://v-hadoop3.persistent.co.in:60030/jobdetails.jsp?jobid=job_201002091652_0190 > Kill Command = /root/hadoop-0.20.1/bin/../bin/hadoop job > -Dmapred.job.tracker=v-hadoop3.persistent.co.in:30001 -kill > job_201002091652_0190 > 2010-05-02 09:29:04,733 map = 0%, reduce =0% > 2010-05-02 09:29:18,799 map = 100%, reduce =0% > 2010-05-02 09:29:21,823 map = 100%, reduce =100% > Ended Job = job_201002091652_0190 > OK > 0 > 1 > 2 > 3 > 4 > 5 > 6 > 7 > 8 > 9 > 10 > Time taken: 22.268 seconds > ===================================== > > Can anyone point to what can be the problem here? Which module is a suspect > here? > > Regards, > Sachin > > DISCLAIMER ========== This e-mail may contain privileged and confidential > information which is the property of Persistent Systems Ltd. It is intended > only for the use of the individual or entity to which it is addressed. If > you are not the intended recipient, you are not authorized to read, retain, > copy, print, distribute or use this message. If you have received this > communication in error, please notify the sender and delete all copies of > this message. Persistent Systems Ltd. does not accept any liability for > virus infected mails. > The comments for that issue seem to suggest the patch is not complete yet. For reference 'select *' queries simply read that block data from hdfs so they do not use map-reduce (and thus probably do not use any indexes either. Edward
