Re: question on create database

2015-04-02 Thread Chen Song
be in that directory. AFAIK > there's no way to prevent that. > > Alan. > > Chen Song > April 2, 2015 at 8:15 > I have a dumb question on DDL statement "create database" > > Say if I create a database > CREATE DATABASE abc > LOCATION '/my/preferre

question on create database

2015-04-02 Thread Chen Song
ferred/directory? I searched around but could not find a way to enforce this. -- Chen Song

Re: question on HIVE-5891

2014-11-18 Thread Chen Song
I haven't found a workaround yet. On Thu, Nov 13, 2014 at 11:25 AM, Stéphane Verlet wrote: > Chen > > Did you find a workaround? Does anybody else have a suggestion? > > Thank you > > Stephane > > On Mon, Aug 4, 2014 at 9:00 AM, Chen Song wrote: > >> I

Re: mapreduce.job.queuename doesn't work with hiveserver2

2014-10-10 Thread Chen Song
2//url:1/db?mapreduce.job.queuename=q1 > > > > *From:* Chen Song [mailto:chen.song...@gmail.com] > *Sent:* Friday, October 10, 2014 2:38 AM > *To:* user > *Subject:* mapreduce.job.queuename doesn't work with hiveserver2 > > > > By setting mapreduce.job.queue
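
For readers following the reply above, here is a minimal sketch of passing the queue name as a Hive configuration override in a HiveServer2 JDBC URL, where overrides follow the `?`. The host, port, database, and queue values are hypothetical placeholders, and whether the override takes effect in a given deployment is the subject of this thread, so treat it as an assumption to verify.

```shell
#!/bin/sh
# Hypothetical connection details; replace with your own HiveServer2 endpoint.
host="hs2.example.com"
port=10000
db="default"
queue="q1"

# In a HiveServer2 JDBC URL, Hive configuration overrides go after '?'.
jdbc_url="jdbc:hive2://${host}:${port}/${db}?mapreduce.job.queuename=${queue}"

# Print the URL so it can be pasted into a JDBC client.
printf '%s\n' "$jdbc_url"
```

The script only assembles and prints the URL; it does not open a connection.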

mapreduce.job.queuename doesn't work with hiveserver2

2014-10-09 Thread Chen Song
ere a way I can configure a specific queue for queries going through hiveserver2? -- Chen Song

Re: hive auto join conversion

2014-08-12 Thread Chen Song
hen On Wed, Jul 30, 2014 at 10:07 PM, Eugene Koifman wrote: > would manually rewriting the query from (T1 union all T2) LOJ S to > equivalent (T1 LOJ S) union all (T2 LOJ S) help work around this issue? > > > On Wed, Jul 30, 2014 at 6:19 PM, Chen Song wrote: > >> I tried t

question on HIVE-5891

2014-08-04 Thread Chen Song
I am using the CDH5 distribution, and it doesn't look like this JIRA https://issues.apache.org/jira/browse/HIVE-5891 has been backported into CDH 5.1.0. Is there a workaround to modify the query that is subject to this problem? -- Chen Song

Re: hive auto join conversion

2014-07-30 Thread Chen Song
efault (see > https://issues.apache.org/jira/browse/HIVE-4042) > > Thanks, > Navis > > > 2014-07-31 10:04 GMT+09:00 Chen Song : > > I am using cdh5 with hive 0.12. We have some hive jobs migrated from hive >> 0.10 and they are written like below: >> >>

hive auto join conversion

2014-07-30 Thread Chen Song
- MapJoin Followed by MapJoin So if one side of the table (big side) is a union of some tables and the other side is a small table, Hive would not be able to do a map join at all? Is that correct? If correct, what should I do to make the job backward compatible? -- Chen Song

php thrift client for hiveserver2

2014-05-05 Thread Chen Song
I am using CDH5 with Hive 0.12.0. I searched the internet myself but could not find sample PHP code to connect to hiveserver2. I know there are PHP files generated by Thrift, but they seem broken, as shown below. php -l /usr/lib/hive/lib/php/packages/hive_service/TCLIService.php PHP Parse error:

Re: the php client seems to be broken

2014-04-23 Thread Chen Song
; > > > On 03/17/2014 11:45 AM, Jeremy wrote: > > I am trying to use the PHP Hive Thrift client and there are namespace and > autoload issues. > Autoload is not generated and I have not found any instructions on how to > properly generate it. > Also some files contain namespace; which is a syntax issue. > Is anyone aware of these issues and knows how to fix them? > > > > > -- Chen Song

Re: single MR stage for join and group by

2013-08-02 Thread Chen Song
mize.mapjoin.mapreduce. > > Thanks, > > Yin > > > On Thu, Aug 1, 2013 at 5:32 PM, Stephen Sprague wrote: > >> and what version of hive are you running your test on? i do believe - >> not certain - that hive 0.11 includes the optimization you seek. >> >>

single MR stage for join and group by

2013-08-01 Thread Chen Song
Suppose we have 2 simple tables A id int value string B id When Hive translates the following query select max(A.value), A.id from A join B on A.id = B.id group by A.id; it launches 2 stages, one for the join and one for the group by. My understanding is that if the join key set is a subset

Re: Hive producing difference outputs

2012-12-19 Thread Chen Song
resent as no one has replied yet :) On Thu, Nov 15, 2012 at 11:18 AM, Chen Song wrote: > Hi Folks > > We are getting inconsistent output when running some semantically same > Hive queries. We are using *CDH3u3* with *Hive 0.7.1.* > > The query I am running performs a ma

Re: map side join with group by

2012-12-13 Thread Chen Song
d to do. > > If you have such a case (maybe if you think that it will improve > performance), please feel free to raise a JIRA and get it reviewed. If it's > valid I think people will provide more ideas > > > On Fri, Dec 14, 2012 at 12:42 AM, Chen Song wrote: > >> Niti

Re: map side join with group by

2012-12-13 Thread Chen Song
13, 2012 at 11:54 PM, Chen Song wrote: > >> Understood the fact that it is impossible in the same MR job if both >> join and group by are gonna happen in the reduce phase (because the join >> keys and group by keys are different). But for a map side join, the joins >> would

Re: map side join with group by

2012-12-13 Thread Chen Song
y are two different jobs > > > On Thu, Dec 13, 2012 at 8:26 PM, Chen Song wrote: > >> Yeah, my abridged version of query might be a little broken but my point >> is that when a query has a map join and group by, even in its simplified >> incarnation, it will launch two jobs

Re: map side join with group by

2012-12-13 Thread Chen Song
be applied in a separate MR job. >> >> This is just my understanding, the full proof answer would lie in >> checking out the explain plans and the Semantic Analyzer code. >> >> And for completeness, there is a conditional task (starting Hive 0.7) >> that will conv

map side join with group by

2012-12-12 Thread Chen Song
e query plan is that all the 2nd job's mappers do is take the 1st job's mapper output. -- Chen Song

Re: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

2012-10-29 Thread Chen Song
is >> intended solely for the attention and use of the named addressee and may be >> confidential. If you are not the intended recipient, you are reminded that >> the information remains the property of the sender. You must not use, >> disclose, distribute, copy, print or rely

Re: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

2012-10-29 Thread Chen Song
sender. You must not use, > disclose, distribute, copy, print or rely on this e-mail. If you have > received this message in error, please contact the sender immediately and > irrevocably delete this message and any copies. > > -- Chen Song

Re: general question on hive client log

2012-10-25 Thread Chen Song
. > if you want the cwiki page, here it is > > https://cwiki.apache.org/Hive/languagemanual-cli.html#LanguageManualCli-Logging > > > > > On Thu, Oct 25, 2012 at 1:15 AM, Chen Song wrote: > > I have searched online but could not find a comprehensive introduction on >

Re: Error in semantic analysis: Unable to fetch table

2012-10-19 Thread Chen Song
Wiley kwi...@keithwiley.com keithwiley.com > music.keithwiley.com > > "Luminous beings are we, not this crude matter." >-- Yoda > > > > -- Chen Song

Re: can i define an udf which can process more than one argument?

2012-10-19 Thread Chen Song
evaluate(String ip) { > > } > > can i define a udf like COALESCE(T v1, T v2, …) or if(boolean > testCondition, T valueTrue, T valueFalseOrNull)? > > ------ > Chris Gong > -- Chen Song

Re: ERROR: Hive subquery showing

2012-09-27 Thread Chen Song
r.java:156) >> >> Regards >> Yogesh Kumar >> >> -- >> Subject: Re: ERROR: Hive subquery showing >> To: user@hive.apache.org >> From: bejoy...@yahoo.com >> Date: Thu, 27 Sep 2012 19:48:25 + >> >> Hi yogesh >> >

Re: ERROR: Hive subquery showing

2012-09-27 Thread Chen Song
'name': (possible column names are: _col0) > > Please help and suggest why it is so, and what would be the query; > > > Thanks & regards > Yogesh Kumar > > > > > > -- Chen Song

Re: Re: size of RCFile in hive

2012-09-27 Thread Chen Song
You can force a reduce phase by adding a DISTRIBUTE BY or ORDER BY clause to your select query. On Thu, Sep 27, 2012 at 2:03 PM, 王锋 wrote: > but it's map only job > > > At 2012-09-27 05:39:39,"Chen Song" wrote: > > As far as I know, the number of files e

Re: size of RCFile in hive

2012-09-26 Thread Chen Song
le select . > > > the settings: > hive.merge.mapfiles=true > hive.merge.mapredfiles=true > hive.merge.size.per.task=64000 > hive.merge.size.smallfiles.avgsize=8000 > didn't work. > > > who could tell me how to solve it? -- Chen Song

Re: How can I get the constant value from the ObjectInspector in the UDF

2012-09-26 Thread Chen Song
ime > the real class of the 2nd parameter is WritableIntObjectInspector. I can > get the type, but how can I get the real value of it? > 6) This is a kind of ConstantsObjectInspector, so it should be able to give the > value to me, as it already knows the type is int. But how? > 7) I don't want to try to get the value at the evaluate stage. Can I get > this value at the initialize stage? > > Thanks > > Yong > > > > > -- > Chen Song > > > -- Chen Song

Re: How can I get the constant value from the ObjectInspector in the UDF

2012-09-26 Thread Chen Song
to get the value at the evaluate stage. Can I get > this value at the initialize stage? > > Thanks > > Yong > -- Chen Song

Question on escaped characters in Hive Shell

2012-03-01 Thread Chen Song
Hi All I have a question on how a quoted Hive query is handled when executed with 'hive -e'. The command I ran looks like: > hive -e "select regexp_extract(col1, '\\d+') from A where col2='some value' > limit 5" When the query gets passed into Hive, it is interpreted as > select regexp_extract(col1, \d+') from
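
Part of what the message describes happens in the shell before Hive ever sees the query: inside double quotes, a POSIX shell collapses `\\` into a single `\`. The sketch below illustrates only that step. It uses `printf` as a stand-in for `hive -e` so it runs without a Hive install, and the shortened query is an illustrative assumption rather than the poster's exact command. Hive may apply its own escape processing to the string literal afterwards, which this sketch does not model.

```shell
#!/bin/sh
# printf '%s' prints its argument exactly as the shell hands it over,
# so it shows what a program such as `hive -e` would receive.

# Double quotes: the shell turns \\ into \ before the program sees the query.
double_quoted=$(printf '%s' "select regexp_extract(col1, '\\d+') from A")

# Single quotes: backslashes are passed through untouched.
single_quoted=$(printf '%s' 'select regexp_extract(col1, "\\d+") from A')

printf 'double-quoted argument: %s\n' "$double_quoted"
printf 'single-quoted argument: %s\n' "$single_quoted"
```

Comparing the two printed lines shows how many backslashes survive each quoting style, which helps when working out how many are needed for the regular expression to reach Hive intact.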

Re: question on hive multiple insert

2012-01-20 Thread Chen Song
Please disregard my previous email. I figured out that the correct syntax should be FROM a JOIN b on a.col1 = b.col1 INSERT a.col1, a.col2 INSERT ... Thanks Chen From: Chen Song To: hive user list Sent: Friday, January 20, 2012 10:47 AM Subject: question

question on hive multiple insert

2012-01-20 Thread Chen Song
Hi I am reading Hive's multiple insert syntax manual and wondering if it is possible to use a join in any individual insert. e.g., FROM a INSERT a.col1, a.col2 join b on a.col1 = b.col1 INSERT ... Apparently, Hive doesn't like this query and gives a syntax error. In other words, my question

Re: pass entire row as parameter in hive UDF

2011-11-01 Thread Chen Song
Can this only be used in a regular select statement, or also as an argument to a UDF? In that case, how should I define my UDF/GenericUDF method signature to accept columns in this form? Will Hive automatically expand the column list and pass the columns to the custom UDF? If there is any example, that would

pass entire row as parameter in hive UDF

2011-10-31 Thread Chen Song
Hi All In Hive, I would like to write a UDF that accepts a sequence of parameters. Because the number of parameters is large and the particular function that I am writing is specific to a set of tables (joined in some way in the SQL), I am wondering if there is a way to pass the entire row