On Sun, May 2, 2010 at 2:29 AM, Sachin Bochare <
[email protected]> wrote:

>    Hi,
>
> I applied the index patch available at:
> https://issues.apache.org/jira/browse/HIVE-678
>
> However after applying the indexing patch, simple select statements are not
> showing any results. The "select *" is working but selecting a specific
> column is not working. I have pasted an example below which illustrates the
> problem.
>
> The same select is working without the patch on the same metastore_db. The
> only difference between working code and non-working code is the patch.
>
> I used the 796926 version of the code. The patch attached in HIVE-678 was
> created against this version.
>
> Following example illustrates the problem:
>
> Example with patch code:
> -----------------------------
>
> =====================================
> hive> create table ourtest (empid int, firstname string, lastname string,
> hoursworked int)
> partitioned by(dt string, place string) clustered by (empid) sorted
> by(hoursworked) into 4 buckets row format delimited fields terminated by
> ',' stored as textfile;
> OK
> Time taken: 0.307 seconds
> hive> LOAD DATA LOCAL INPATH '/root/data/ourtest_data.csv' INTO
> TABLE ourtest PARTITION(dt='2010-02-27', place='Pune');
> Copying data from file:/root/data/ourtest_data.csv
> Loading data to table ourtest partition {dt=2010-02-27, place=Pune}
> OK
> Time taken: 0.753 seconds
> hive> select * from ourtest; ---> Select * is working fine.
> OK
> 0       firstname       lastname        0       2010-02-27      Pune
> 1       firstname1      lastname1       1       2010-02-27      Pune
> 2       firstname2      lastname2       2       2010-02-27      Pune
> 3       firstname3      lastname3       3       2010-02-27      Pune
> 4       firstname4      lastname4       4       2010-02-27      Pune
> 5       firstname5      lastname5       5       2010-02-27      Pune
> 6       firstname6      lastname6       6       2010-02-27      Pune
> 7       firstname7      lastname7       7       2010-02-27      Pune
> 8       firstname8      lastname8       8       2010-02-27      Pune
> 9       firstname9      lastname9       9       2010-02-27      Pune
> 10      firstname10     lastname10      10      2010-02-27      Pune
> Time taken: 0.106 seconds
> hive> select empid from ourtest; ---> Selecting specific column is not
> working.
> Total MapReduce jobs = 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201002091652_0170, Tracking URL =
> http://v-hadoop3.persistent.co.in:60030/jobdetails.jsp?jobid=job_201002091652_0170
> Kill Command = /root/hadoop-0.20.1/bin/../bin/hadoop job
> -Dmapred.job.tracker=v-hadoop3.persistent.co.in:30001 -kill
> job_201002091652_0170
> 2010-05-02 08:40:48,951 map = 0%,  reduce =0%
> 2010-05-02 08:40:58,044 map = 50%,  reduce =0%
> 2010-05-02 08:40:59,057 map = 100%,  reduce =0%
> 2010-05-02 08:41:02,067 map = 100%,  reduce =100%
> Ended Job = job_201002091652_0170
> OK
> Time taken: 15.494 seconds
> =====================================
>
>  Example without patch code:
> --------------------------------
> The example query works with the unpatched code on the same
> metastore_db.
>
> =====================================
> r...@v-hadoop3:~/sachin/Hive-796926-Patch# ../Hive-796926/build/dist/bin/hive
> Hive history file=/tmp/root/hive_job_log_root_201005020928_924651644.txt
> hive> select empid from ourtest;
> Total MapReduce jobs = 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201002091652_0190, Tracking URL =
> http://v-hadoop3.persistent.co.in:60030/jobdetails.jsp?jobid=job_201002091652_0190
> Kill Command = /root/hadoop-0.20.1/bin/../bin/hadoop job
> -Dmapred.job.tracker=v-hadoop3.persistent.co.in:30001 -kill
> job_201002091652_0190
> 2010-05-02 09:29:04,733 map = 0%,  reduce =0%
> 2010-05-02 09:29:18,799 map = 100%,  reduce =0%
> 2010-05-02 09:29:21,823 map = 100%,  reduce =100%
> Ended Job = job_201002091652_0190
> OK
> 0
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
> Time taken: 22.268 seconds
> =====================================
>
> Can anyone point to what can be the problem here? Which module is a suspect
> here?
>
> Regards,
>  Sachin
>

The comments on that issue seem to suggest the patch is not complete yet.
For reference, 'select *' queries simply read the block data from HDFS, so
they do not use map-reduce (and thus probably do not use any indexes either).
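
One way to check this, as a sketch: Hive's EXPLAIN statement prints the plan
for a query without running it. Using the ourtest table from the example
above, comparing the two plans should show whether a map-reduce stage is
involved. I have not run these against the patched build, so the expected
difference is an assumption.

```sql
-- Plan for the query that returns rows with the patch applied.
EXPLAIN select * from ourtest;

-- Plan for the query that returns nothing with the patch applied.
EXPLAIN select empid from ourtest;
```

If the explanation above holds, the first plan would contain only a fetch
stage, while the second would include a map-reduce stage, which is where the
patched code paths would come into play.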

Edward
