RE: tez session timesout?

2017-01-16 Thread Brotanek, Jan
It seems Tez is spawning many processes and using all file descriptors, causing Unix to temporarily run out of resources. I suppose this may be the problem, but I don't know why it doesn't happen when the 2nd query is invoked. It always fails on the 3rd query. Are there any settings which can prevent this

DateFunction

2017-01-16 Thread Mahender Sarangam
Hi, is there any date function which returns the full month name for a given timestamp?

Fwd: Support of Theta Join

2017-01-16 Thread Mahender Sarangam
Is there any support for theta joins in Spark? We have a requirement to identify the country name based on a range of IP addresses in a table. Forwarded Message Subject: Support of Theta Join Date: Thu, 12 Jan 2017 15:19:51 + From: Mahender Sarangam
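The requirement above (matching an IP address against a table of address ranges) is a range, or non-equi, join. A minimal Python sketch of the lookup logic follows; it is not Spark or Hive code. The range table, the country labels, and the names `build_index` and `lookup_country` are hypothetical, and the ranges are assumed not to overlap.

```python
import bisect
import ipaddress

# Hypothetical range table: (start_ip, end_ip, country); ranges must not overlap.
RANGES = [
    ("10.0.0.0", "10.0.0.255", "Country A"),
    ("10.0.1.0", "10.0.1.255", "Country B"),
    ("192.168.0.0", "192.168.255.255", "Country C"),
]


def build_index(ranges):
    """Convert the ranges to integers and sort them by start address."""
    rows = sorted(
        (int(ipaddress.ip_address(start)), int(ipaddress.ip_address(end)), country)
        for start, end, country in ranges
    )
    starts = [row[0] for row in rows]
    return starts, rows


def lookup_country(ip, index):
    """Return the country whose range contains ip, or None if no range matches."""
    starts, rows = index
    value = int(ipaddress.ip_address(ip))
    # Find the last range whose start is <= value, then check its end bound.
    pos = bisect.bisect_right(starts, value) - 1
    if pos >= 0 and rows[pos][1] >= value:
        return rows[pos][2]
    return None


INDEX = build_index(RANGES)
print(lookup_country("10.0.1.17", INDEX))
```

Sorting the ranges once and using binary search avoids comparing every address against every range, which is what a naive theta join on `ip BETWEEN start_ip AND end_ip` would do.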

Re: DateFunction

2017-01-16 Thread Jitendra Yadav
Ref: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions . int month(string date) Returns the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11. Does it fit your requirement? Thanks On

VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Coming from a DBMS background, I tend to treat the columns in Hive like those of an RDBMS table. For example, if a table is created in Hive as Parquet, I will use VARCHAR(30) for a column that is defined as VARCHAR(30) in the source. If a column is defined as TEXT in the RDBMS table, I use STRING in Hive with a

Re: DateFunction

2017-01-16 Thread Devopam Mittra
Hi Mahender, I don't know your version of Hive. Please try: date_format(current_date,'M') Regards, Dev On Mon, Jan 16, 2017 at 6:56 PM, Jitendra Yadav wrote: > Ref: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF# >
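The two replies in this thread return a month number, while the question asks for the full month name. A minimal Python sketch of that distinction follows; the helper name `full_month_name` and the fixed English name list are assumptions for illustration. In Hive itself, date_format takes Java SimpleDateFormat patterns, so a pattern of 'MMMM' rather than 'M' should yield the full name.

```python
from datetime import datetime

# Fixed English names so the result does not depend on the system locale.
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def full_month_name(ts: str) -> str:
    """Return the full month name for 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM:SS'."""
    # Only the date part is needed, so parse the first 10 characters.
    parsed = datetime.strptime(ts[:10], "%Y-%m-%d")
    return MONTH_NAMES[parsed.month - 1]


print(full_month_name("1970-11-01 00:00:00"))  # November
```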

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread sreebalineni .
How is that efficient storage-wise? As far as I can see, it is in HDFS and storage is based on your block size. Am I missing something here? On Jan 16, 2017 9:07 PM, "Mich Talebzadeh" wrote: Coming from DBMS background I tend to treat the columns in Hive similar

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Thanks both. STRING has a max length of 2GB, so in MapReduce with a 128MB block size we are talking about 16 blocks. With VARCHAR(30) we are talking about 1 block. I have not really experimented with this; however, I assume a table of 100k rows with VARCHAR columns will have a smaller footprint
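The block arithmetic in this message can be checked with a short Python sketch. It only reproduces the upper-bound calculation for a single maximum-length value; it assumes binary units (1 GB = 1024**3 bytes) and one byte per character for VARCHAR(30), and the name `blocks_needed` is hypothetical. It says nothing about how much space typical values actually occupy.

```python
MAX_STRING_BYTES = 2 * 1024**3   # 2 GB maximum STRING length, as stated in the message
BLOCK_BYTES = 128 * 1024**2      # 128 MB block size
VARCHAR_30_BYTES = 30            # assumption: one byte per character


def blocks_needed(value_bytes: int, block_bytes: int) -> int:
    """Number of blocks a value of the given size would span (ceiling division)."""
    return -(-value_bytes // block_bytes)


print(blocks_needed(MAX_STRING_BYTES, BLOCK_BYTES))   # 16
print(blocks_needed(VARCHAR_30_BYTES, BLOCK_BYTES))   # 1
```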

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Thanks Elliot for the insight. Another issue is that Spark does not support "CHAR" types. It supports VARCHAR. Often one uses Spark as well on these tables. This should not really matter. I tend to define CHAR(N) to be VARCHAR(N) as the assumption is that the table ingested into Parquet say is

Re: File not found of TEZ libraries with tez.lib.uris configuration

2017-01-16 Thread Jörn Franke
Maybe the wrong configuration file is picked up? > On 17 Jan 2017, at 07:44, wenxing zheng wrote: > > Dear all, > > I met an issue in the TEZ configuration for HIVE, as from the HIVE logs file: > >> Caused by: java.io.FileNotFoundException: File does not exist: >>

Re: File not found of TEZ libraries with tez.lib.uris configuration

2017-01-16 Thread Jörn Franke
Sorry, never mind my previous mail... in the stack it seems to look exactly for this file. Can you try to download the file? Can you check if these are all files needed? I think you need to extract the .tar.gz and point to the jars (check the Tez web site for the config). > On 17 Jan 2017, at

File not found of TEZ libraries with tez.lib.uris configuration

2017-01-16 Thread wenxing zheng
Dear all, I met an issue in the TEZ configuration for HIVE, as from the HIVE logs file: > *Caused by: java.io.FileNotFoundException: File does not exist: > hdfs://hdfscluster/apps/tez-0.8.4/tez.tar.gz* > *at >

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Sounds like VARCHAR and CHAR types were created for Hive to have ANSI SQL Compliance. Otherwise they seem to be practically the same as String types. HTH

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Elliot West
Internally it looks as though Hive simply represents CHAR/VARCHAR values using a Java String and so I would not expect a significant change in execution performance. The Hive JIRA suggests that these types were added to 'support for more SQL-compliant behavior, such as SQL string comparison

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Gopal Vijayaraghavan
> Sounds like VARCHAR and CHAR types were created for Hive to have ANSI SQL > Compliance. Otherwise they seem to be practically the same as String types. They are relatively identical in storage, except both are slower on the CPU in actual use (CHAR has additional padding code in the