ORC NumberFormatException with type DECIMAL

2014-01-10 Thread Kristopher Kane
Hive 0.12 on Hadoop 2. I have a table with a mix of STRING and DECIMAL fields that is stored as ORC with no compression or partitions. I wanted to create a copy of this table with CTAS, also stored as ORC. The job fails with a NumberFormatException in the HiveDecimal class but I can't narrow it down the

Re: ORC NumberFormatException with type DECIMAL

2014-01-10 Thread Kristopher Kane
Kristopher Kane kkane.l...@gmail.com wrote: Hive .12 on Hadoop 2 I have a table with a mix of STRING and DECIMAL fields that is stored as ORC no compression or partitions. I wanted to create a copy of this table with CTAS, stored also as ORC. The job fails with NumberFormatException

Re: Hive Join Running Out of Memory

2014-07-20 Thread Kristopher Kane
Clay, keep in mind that setting this to false in the global hive-site.xml means that no client-side hash table will be generated, so you will miss out on optimizations for other joins. You should set this in your query directly. Another option is to increase the client side heap to allow

WebHCat TempletonJobController return codes

2015-03-18 Thread Kristopher Kane
Is there a list of possible return codes as logged by the TempletonJobController's map task? I'm getting an RC of 6 for a pig+hcat job that works from the CLI: o.a.h.hcatalog.templeton.tool.launchMapper: templeton: Writing exit value 6 to... -Kris

hive.exec.scratchdir

2016-07-29 Thread Kristopher Kane
Is there a variable that can be used for the user principal in scratchdir instead of the JVM user.name? https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.scratchdir Kris

Potential performance implications of Avro backed table using Snappy

2016-12-16 Thread Kristopher Kane
I see that Hive doesn't seem to know about an Avro SerDe compressed table (Hive 1.2.1) in 'describe extended' when determining compression with the following: SET hive.exec.compress.output=true; SET avro.output.codec=snappy; -- likely because you set those on INSERT and there isn't any DDL

Hive 1.2.1 (HDP) ArrayIndexOutOfBounds for highly compressed ORC files

2018-02-26 Thread Kristopher Kane
I have a highly compressed table, generated from Hive DDL, that is backed by a single ORC file. Raw size reports 120 GB; ORC/Snappy compresses it down to 990 MB (ORC with no compression is still only 1.3 GB). Hive on MR is throwing ArrayIndexOutOfBoundsException like the following: Diagnostic Messages for this

Re: Hive 1.2.1 (HDP) ArrayIndexOutOfBounds for highly compressed ORC files

2018-02-27 Thread Kristopher Kane
Gopal. That was exactly it. As always, a succinct, accurate answer. Thanks, -Kris On Mon, Feb 26, 2018 at 8:06 PM, Gopal Vijayaraghavan wrote: > Hi, > > > Caused by: java.lang.ArrayIndexOutOfBoundsException > > at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$ >

External table data and Ranger Security (doAs=false)

2019-03-29 Thread Kristopher Kane
If using a default external table location, in a cluster with Ranger Authorization, the table location and data are owned by the `hive` user. Since the table is external, there doesn't seem to be a way to delete this data other than impersonating or becoming the `hive` or `hdfs` principal. Is

Query result cache size

2019-06-03 Thread Kristopher Kane
'hive.query.results.cache.max.size' - Is this limit per query result, total for all users across all HS2 instances or per HS2 instance? Thanks, Kris

Re: Restrict users from creating tables in default warehouse

2019-06-13 Thread Kristopher Kane
Authorization, rather. On Thu, Jun 13, 2019 at 10:51 AM Kristopher Kane wrote: > > You really have no choice with storage based authentication. > > On Fri, Jun 7, 2019 at 12:24 PM Mainak Ghosh wrote: > > > > Hey Alan, > > > > Thanks for replying.

Re: Restrict users from creating tables in default warehouse

2019-06-13 Thread Kristopher Kane
You really have no choice with storage based authentication. On Fri, Jun 7, 2019 at 12:24 PM Mainak Ghosh wrote: > > Hey Alan, > > Thanks for replying. We are currently using storage based authorization and > Hive 2.3.2. Unfortunately, we found that the default warehouse path requires > a 777

JDBC Storage Handler credential protection

2019-05-22 Thread Kristopher Kane
The JDBC storage handler wiki states: "You will need to protect the keystore file by only authorize targeted user to read this file using authorizer (such as ranger). Hive will check the permission of the keystore file to make sure user has read permission of it when creating/altering table." I

Copying non-jar files to the job

2019-08-06 Thread Kristopher Kane
Does anyone have a pointer to how I can copy non-jar files from a storage handler such that they are accessible by the map task executor in usercache? Thanks, Kris

Re: Copying non-jar files to the job

2019-08-06 Thread Kristopher Kane
dCacheFile(new URI("hdfs://tmp/my.truststore")); .. and the Distributed Cache directly but I do not see them in the directory listing of a Tez log. On Tue, Aug 6, 2019 at 1:44 PM Kristopher Kane wrote: > > Does anyone have a pointer to how I can copy non-jar files from a
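
One thing that may be worth checking in the snippet above: with java.net.URI, the text between `hdfs://` and the next slash is parsed as the authority (the NameNode host), not as a directory. The following is a minimal sketch that assumes only the JDK; the class and method names are hypothetical, and it does not call Hadoop or Tez.

```java
import java.net.URI;

public class TruststoreUriCheck {
    // Hypothetical helper: returns "authority|path" so the two parts
    // of a URI string can be compared side by side.
    public static String describe(String uri) {
        URI u = URI.create(uri);
        return u.getAuthority() + "|" + u.getPath();
    }

    public static void main(String[] args) {
        // Two slashes: "tmp" is read as the host, the path is only /my.truststore.
        System.out.println(describe("hdfs://tmp/my.truststore"));
        // Three slashes: no authority (null), the path is /tmp/my.truststore.
        System.out.println(describe("hdfs:///tmp/my.truststore"));
    }
}
```

If the file actually lives under /tmp on the default filesystem, the three-slash form, or a fully qualified URI that names the NameNode, is probably what was intended. Whether this explains the files missing from the Tez listing is not established by the thread.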

Storage handlers and access to files.

2019-07-25 Thread Kristopher Kane
I'm trying to add protected SSL credentials to the Kafka Storage Handler. This is my first jump into the pool. I have it working where the creds for the keystore/truststore are in JCEKS files in HDFS and the KafkaStorageHandler class loads them into the job configuration based on some new