Re: Implement in clause with or clause

2010-08-04 Thread Zheng Shao
There are no risks, but it will be slower, especially when the list after IN is very long. Zheng 2010/8/3 我很快乐 896923...@qq.com: Thank you for your reply. My company requires us to use version 0.4.1, so I can't upgrade. Could you tell me what the risks are if I use the OR

why I can't reply email

2010-08-04 Thread 我很快乐
I can send email to hive-user@hadoop.apache.org, but when other people reply to my email, I can't reply to them, and I receive the message below: host mx1.eu.apache.org[192.87.106.230] said: 552 spam score (14.4) exceeded threshold (in reply to end of DATA command). Could anybody

why is slow when use OR clause instead of IN clause

2010-08-04 Thread lei liu
My company requires us to use version 0.4.1, which doesn't support the IN clause. I want to use an OR clause (example: where id=1 or id=2 or id=3) to implement the IN clause (example: id in (1,2,3)). I know it will be slower, especially when the list after IN is very long. Could anybody
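
As an illustration of the rewrite described above, here is a minimal Python sketch that builds the OR chain from a list of integer ids. The helper name `in_as_or` and the table name `t` are hypothetical and do not come from the thread or from Hive.

```python
def in_as_or(column, values):
    """Build an OR chain equivalent to `column in (values...)` for integer values."""
    if not values:
        raise ValueError("need at least one value")
    # int() rejects anything that is not a plain integer, so no quoting is needed.
    terms = ["{}={}".format(column, int(v)) for v in values]
    return "(" + " or ".join(terms) + ")"


# `t` is a hypothetical table name used only for this example.
clause = in_as_or("id", [1, 2, 3])
query = "select * from t where " + clause
print(query)
```

The parentheses around the chain keep it safe to combine with other conditions using AND.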

hive with hadoop 0.21.0

2010-08-04 Thread bharath v
Hi all, Has anyone tried configuring Hive with Hadoop 0.21.0? We want to use some features of 0.21.0 and configure Hive with it. If anybody has already tried this out, maybe you can share your experiences. Hoping for a response. Thanks Bharath.V 4th year Undergraduate, IIIT Hyderabad.

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread Edward Capriolo
On Wed, Aug 4, 2010 at 6:10 AM, lei liu liulei...@gmail.com wrote: My company requires us to use version 0.4.1, which doesn't support the IN clause. I want to use an OR clause (example: where id=1 or id=2 or id=3) to implement the IN clause (example: id in (1,2,3)). I know it will be slower

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread Mark Tozzi
I haven't looked at the code, but I assume the query parser would sort the 'in' terms and then do a binary search lookup into them for each row, while the 'or' terms don't have that kind of obvious relationship and are probably tested in sequence. This would give the 'in' O(log N) performance
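
To make the assumption above concrete, here is a small Python sketch contrasting a sequential test with a binary search over sorted terms. It only illustrates the two lookup strategies; it is not taken from Hive's code, and the function names are made up for this example.

```python
from bisect import bisect_left


def matches_or_chain(value, terms):
    # Sequential test, like evaluating id=a or id=b or ...: O(N) comparisons per row.
    for t in terms:
        if value == t:
            return True
    return False


def matches_sorted_in(value, sorted_terms):
    # Binary search over terms sorted once up front: O(log N) comparisons per row.
    i = bisect_left(sorted_terms, value)
    return i < len(sorted_terms) and sorted_terms[i] == value


terms = [42, 7, 19, 3]
sorted_terms = sorted(terms)  # sorting happens once, not once per row
print(matches_or_chain(19, terms), matches_sorted_in(19, sorted_terms))
```

Both functions give the same answer for every value; only the number of comparisons per row differs.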

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread lei liu
Hello Edward Capriolo, Thank you for your reply. Are you sure that if you string enough 'or' together (say 8000) the query parser which uses java beans serialization will OOM? How much memory do you assign to Hive? 2010/8/4 Edward Capriolo edlinuxg...@gmail.com On Wed, Aug 4, 2010 at 6:10 AM, lei

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread Edward Capriolo
On Wed, Aug 4, 2010 at 12:15 PM, lei liu liulei...@gmail.com wrote: Hello Edward Capriolo, Thank you for your reply. Are you sure that if you string enough 'or' together (say 8000) the query parser which uses java beans serialization will OOM? How much memory do you assign to Hive? 2010/8/4

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread lei liu
I currently assign 100M of memory to Hive. How many 'OR' terms do you think that can support? 2010/8/5 Edward Capriolo edlinuxg...@gmail.com On Wed, Aug 4, 2010 at 12:15 PM, lei liu liulei...@gmail.com wrote: Hello Edward Capriolo, Thank you for your reply. Are you sure that if you string enough

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread Ning Zhang
Edward, did you have the HIVE-543 patch merged in your Hive? That patch resolves an issue of OOM on the Hive client side. On Aug 4, 2010, at 9:22 AM, Edward Capriolo wrote: On Wed, Aug 4, 2010 at 12:15 PM, lei liu liulei...@gmail.com wrote: Hello Edward Capriolo, Thank you for your reply. Are

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread Edward Capriolo
On Wed, Aug 4, 2010 at 12:50 PM, Ning Zhang nzh...@facebook.com wrote: Edward, did you have the HIVE-543 patch merged in your Hive? That patch resolves an issue of OOM on the Hive client side. On Aug 4, 2010, at 9:22 AM, Edward Capriolo wrote: On Wed, Aug 4, 2010 at 12:15 PM, lei liu

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread Ning Zhang
Currently an expression tree (a series of ORs in this case) is not collapsed into one operator, nor is any other optimization applied. It would be great to have an optimization rule that converts an OR operator tree into one IN operator. Would you be able to file a JIRA and contribute a patch? On Aug 4, 2010, at
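
The kind of rule suggested above could look roughly like the following Python sketch. The tuple-based expression tree, with nodes `("or", left, right)` and `("eq", column, literal)`, is a hypothetical representation chosen for illustration; it is not Hive's internal plan format, and `collapse_or_to_in` is not an existing Hive function.

```python
def collapse_or_to_in(expr):
    """Collapse an OR tree of equality tests on one column into ("in", column, values).

    Any other expression is returned unchanged.
    """
    if expr[0] != "or":
        return expr
    values, columns = [], set()
    stack = [expr]  # explicit stack, so very deep OR chains do not hit the recursion limit
    while stack:
        node = stack.pop()
        if node[0] == "or":
            # Push the right operand first so the left operand is visited first.
            stack.append(node[2])
            stack.append(node[1])
        elif node[0] == "eq":
            columns.add(node[1])
            values.append(node[2])
        else:
            return expr  # some other operator is present: leave the tree alone
    if len(columns) != 1:
        return expr  # the equalities refer to different columns
    return ("in", columns.pop(), values)


tree = ("or", ("or", ("eq", "id", 1), ("eq", "id", 2)), ("eq", "id", 3))
print(collapse_or_to_in(tree))
```

The rule deliberately gives up as soon as it sees anything other than OR and equality nodes, or more than one column, so it never changes the meaning of the predicate.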

RE: how to prevent user from dropping table created by another user

2010-08-04 Thread Bakshi, Ankita
Thanks Carl for the pointers. The good news is that the user can recover from the failure with the following steps: 1. For tables without partitions, it is as simple as creating the table definition again. 2. For tables with partitions, it involves creating the table definition followed by creating the partitions.

Re: why is slow when use OR clause instead of IN clause

2010-08-04 Thread Ning Zhang
I tested (1000 disjunctions) and it was extremely slow but no OOM. The issue seems to be the fact that we serialize the plan by writing to an HDFS file directly. We probably should cache it locally and then write it to HDFS. On Aug 4, 2010, at 10:23 AM, Edward Capriolo wrote: On Wed, Aug 4,

Re: how to prevent user from dropping table created by another user

2010-08-04 Thread John Sichi
Slight clarification: the no_drop support Siying is adding will apply to all users (even the creator of the table). The analogy is as follows: no_drop mode is like the safety catch on a gun (it prevents the gun from being fired by anyone, even the person holding it, until explicitly taken

Re: hive with hadoop 0.21.0

2010-08-04 Thread Jay Booth
Believe it or not I'm pretty sure that literally nobody has tried to configure Hive with 21.0 yet -- you're probably best off sticking with 20.2. On Wed, Aug 4, 2010 at 9:41 AM, bharath v bharathvissapragada1...@gmail.com wrote: Hi all, Has anyone tried configuring Hive with Hadoop 0.21.0 . We

Re: how to prevent user from dropping table created by another user

2010-08-04 Thread John Sichi
The idea is that in a large shared cluster, you can use this to protect important tables from accidents. This includes protection from bugs in non-human agents such as automated retention/cleanup processes, which are likely to run as an account with full privileges. For tables which are

Re: How to mount/proxy a db table in hive

2010-08-04 Thread John Sichi
Either the handler would need to provide its own InputFormat and Split classes wrapping the ones from DBInputFormat (following the example from existing storage handlers such as HBase, where HBaseSplit extends FileSplit and wraps an underlying TableSplit), or we would need to finally clean up

Re: why I can't reply email

2010-08-04 Thread Bill Graham
Try sending your email as text and not html, if you're not already. Others have also had issues with apache lists with html emails getting flagged as spam more easily. On Wed, Aug 4, 2010 at 3:30 PM, Todd Lee ronnietodd...@gmail.com wrote: maybe qq.com got blacklisted :) T 2010/8/4 我很快乐

Hive JDBC and client encoding

2010-08-04 Thread 김영우
Hi, my Hive server (trunk) is running on RHEL5, and the server's default encoding is UTF8. The log files in HDFS are also encoded in UTF8. I have no problem with the CLI and Linux-based clients; they just work fine. I can see the Korean characters, with no broken characters. I'm developing a Java app using Hive