Removing Hive-on-Spark

2020-07-27 Thread David
Hello Hive Users. I am interested in gathering some feedback on the adoption of Hive-on-Spark. Does anyone care to volunteer their usage information and would you be open to removing it in favor of Hive-on-Tez in subsequent releases of Hive? If you are on MapReduce still, would you be open to

Re: Removing Hive-on-Spark

2020-07-27 Thread David
Hello Stephen, Thanks for your interest. Can you please elaborate a bit more on your question? Thanks. On Mon, Jul 27, 2020 at 4:11 PM Stephen Boesch wrote: > Why would it be this way instead of the other way around? > > On Mon, 27 Jul 2020 at 12:27, David wrote: > >>

Re: Hive Avro: Direct use of embedded Avro Schema

2020-10-31 Thread David
What would your expectation be? That Hive reads the first file it finds and uses that schema in the table definition? What if the table is empty and a user attempts an INSERT? What should be the behavior? The real power of Avro is not so much that the schema can exist (optionally) in the file

Re: Hive Avro: Direct use of embedded Avro Schema

2020-10-31 Thread David
> you can easily create a new version. > Is this the idea? > > Br, > Dennis > -- > *From:* David > *Sent:* Saturday, 31 October 2020 14:52:04 > *To:* user@hive.apache.org > *Subject:* Re: Hive Avro: Directly use of embedded Avro

Re: Does Hive support data encryption?

2021-03-02 Thread David
Not directly. It relies on the underlying storage layer. For example: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html On Tue, Mar 2, 2021 at 6:34 AM qq <987626...@qq.com> wrote: > Hello: > > Does Hive support data encryption? > > Thank

Re: [EXTERNAL] Re: Any plan for new hive 3 or 4 release?

2021-02-27 Thread David
Hello, My hope has been that Hive 4.x would be built on Java 11. However, I've hit many stumbling blocks over the past year towards this goal. I've been able to make some progress, but several things are still stuck. It mostly stems from the fact that hive has many big-ticket dependencies like

Re: Hive 3 has big performance improvement from my test

2023-01-07 Thread David
I spent some time over the past couple of years making micro optimizations within Avro, Parquet, ORC. Curious to know if there's a way for you all to get timings at different levels of the stack to compare and not just look at the top-line numbers. A further breakdown could also help identify

Number of simultaneous hive users

2010-11-20 Thread David Lary
LOCAL INPATH 'data.txt' INTO TABLE mytable; Will these conflict with each other? Is there a better way to achieve a massive data load from millions of files? Each file is large. Thanks David

Partitioning External table

2010-12-28 Thread David Ginzburg
Hi, I am trying to test creation of an external table using partitions, my files on hdfs are: /user/training/partitions/dt=2/engine /user/training/partitions/dt=2/engine engine are sequence files which I have managed to create externally and query from, when I have not used partitions.
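The usual answer to this question: for an external table, Hive does not discover partition directories on its own — each one has to be registered in the metastore. A minimal sketch, with table and column names invented for illustration (not from the original thread):

```sql
-- External table over pre-existing HDFS data, partitioned by dt.
CREATE EXTERNAL TABLE engine_logs (line STRING)
PARTITIONED BY (dt INT)
STORED AS SEQUENCEFILE
LOCATION '/user/training/partitions';

-- Each partition directory must be added explicitly...
ALTER TABLE engine_logs ADD PARTITION (dt=2)
LOCATION '/user/training/partitions/dt=2';

-- ...or discovered in bulk from directories named key=value:
MSCK REPAIR TABLE engine_logs;
```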

Re: MySQL Metastore migration

2011-01-06 Thread David Burley
and tweak the process for improvement -- please reply back so others can bask in the knowledge. Cheers, David Hi all, We've been running hive in the default derby single user mode for a while. Now we've got more users interested in Hive and so would like to change the metastore to run off

insert - Hadoop vs. Hive

2011-03-30 Thread David Zonsheine
Hi, I'm trying to compare adding files to hdfs for Hive usage using Hive inserts vs. adding to the hdfs directly then using Hive. Any comments, blogging about this? Thanks a lot, David Zonsheine

Re: inconsistent results when doing a select over a join

2012-01-09 Thread David Houston
Hi Guy, Inconsistent in that the results are totally off, or the order is different? Thanks Dave On Jan 9, 2012 5:03 PM, Guy Doulberg guy.doulb...@conduit.com wrote: Hi guys, We have been using hive for a while now, and recently we have encountered an issue we just can't understand, We are

RE: inconsistent results when doing a select over a join

2012-01-10 Thread David Ginzburg
wrote: Hey Dave, I didn't understand your question, The inconsistency is slightly different, about 2% of differences, Thanks Guy On 01/09/2012 07:05 PM, David Houston wrote: Hi Guy, Inconsistent in that the results are totally off

Re: Lag function in Hive

2012-04-10 Thread David Kulp
New here. Hello all. Could you try a self-join, possibly also restricted to partitions? E.g. SELECT t2.value - t1.value FROM mytable t1, mytable t2 WHERE t1.rownum = t2.rownum+1 AND t1.partition=foo AND t2.partition=bar If your data is clustered by rownum, then this join should, in theory, be
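The self-join suggested above can be written out as follows — a sketch with invented column names; on Hive 0.11 and later, the built-in LAG window function does the same thing without a join:

```sql
-- Emulating LAG with a self-join on a row-number column.
SELECT t2.value - t1.value AS delta
FROM mytable t1
JOIN mytable t2
  ON t1.rownum = t2.rownum + 1;

-- Hive 0.11+ windowing equivalent, no join needed:
SELECT value - LAG(value, 1) OVER (ORDER BY rownum) AS delta
FROM mytable;
```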

Re: Lag function in Hive

2012-04-10 Thread David Kulp
t2 ON (t1.rownum = t2.rownum + 1 AND t2.partition=bar) WHERE t1.partition=foo; This should be faster as partition selection will happen earlier. This is still going to involve an awful lot of I/O, and not going to be fast. Phil. On 10 April 2012 15:56, David Kulp dk...@fiksu.com wrote

Re: using the key from a SequenceFile

2012-04-19 Thread David Kulp
I'm trying to achieve something very similar. I want to write an MR program that writes results in a record-based sequencefile that would be directly readable from hive as though it were created using STORED AS SEQUENCEFILE with, say, BinarySortableSerDe. From this discussion it seems that

Re: using the key from a SequenceFile

2012-04-19 Thread David Kulp
of it, other than that you won’t notice the difference between sequence or plain text file From: David Kulp [mailto:dk...@fiksu.com] Sent: Thursday, April 19, 2012 2:13 PM To: user@hive.apache.org Subject: Re: using the key from a SequenceFile I'm trying to achieve something very similar. I want

Re: using the key from a SequenceFile

2012-04-19 Thread David Kulp
should be golden. You can presumably use one of the alternative serializers in your MR program, but I haven't tried it, yet. -d On Apr 19, 2012, at 8:52 AM, David Kulp wrote: But I'm not clear on how to write a single row of multiple values in my MR program, since my only way to output data

Re: Managed vs external tables in hive

2012-05-10 Thread David Kulp
It's simpler than this. All files look the same -- and are often very simple delimited text -- whether managed or external. The only difference is that the files associated with a managed table are dropped when the table is dropped and files that are loaded into a managed table are moved into
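The distinction described above — identical file layout, different drop semantics — can be illustrated like this (paths and names invented):

```sql
-- Managed table: DROP TABLE deletes both metadata and data files,
-- and LOAD DATA moves files into the warehouse directory.
CREATE TABLE managed_t (id INT, name STRING);

-- External table: DROP TABLE removes only the metadata;
-- the files under LOCATION are left untouched.
CREATE EXTERNAL TABLE external_t (id INT, name STRING)
LOCATION '/data/external_t';
```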

Re: changing field delimiter for an existing table?

2012-05-11 Thread David Kulp
Here is the default textfile. Substitute delimiters as necessary. CREATE TABLE ... ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS TERMINATED BY '\002' MAP KEYS TERMINATED BY '\003' LINES TERMINATED BY '\n' STORED AS TEXTFILE; On May 11, 2012, at 5:58 PM, Igor Tatarinov

Re: changing field delimiter for an existing table?

2012-05-11 Thread David Kulp
, David Kulp dk...@fiksu.com wrote: Here is the default textfile. Substitute delimiters as necessary. CREATE TABLE ... ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS TERMINATED BY '\002' MAP KEYS TERMINATED BY '\003' LINES TERMINATED BY '\n' STORED AS TEXTFILE

Looking for a working mysql import sqoop command line

2012-05-12 Thread David Morel
(preferably with --direct, since the files I have to put there are quite big) - provide alternative solutions, since maybe I'm going a completely wrong way Thanks a million! David Morel

Re: Newbie - Hive Tutorial question, what format is the sample data file in?

2012-09-01 Thread David Swearingen
are serialized into strings. On Sat, Sep 1, 2012 at 7:52 AM, David Swearingen dswearin...@42six.com wrote: I'm going through the tutorial at https://cwiki.apache.org/Hive/tutorial.html . It's not clear to me what the exact format of the log file would be for the sample queries described eg

Skew join failure

2012-11-30 Thread David Morel
0.8.1. Thanks a lot! David Morel

Re: Skew join failure

2012-12-03 Thread David Morel
On 30 Nov 2012, at 16:46, Mark Grover wrote: Hi David, It seems like Hive is unable to find the skewed keys on HDFS. Did you set *hive.skewjoin.key property? If so, to what value?* Hey Mark, thanks for answering! I didn't set it to anything, but left it at its default value (100,000 IIRC

Mapping existing HBase table with many columns to Hive.

2012-12-06 Thread David Koch
to anything other than binary, but maybe the columns - which are longs - and the values - which are strings - can be mapped to their corresponding Hive datatypes. I include an extract of what a row looks like in HBase shell below: Thank you, /David hbase(main):009:0 scan hits ROW COLUMN+CELL \x00

Re: Mapping existing HBase table with many columns to Hive.

2012-12-06 Thread David Koch
Hello Swarnim, Thank you for your answer. I will try the options you pointed out. /David On Thu, Dec 6, 2012 at 9:10 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: map

Re: Mapping existing HBase table with many columns to Hive.

2012-12-09 Thread David Koch
row keys, qualifiers and values share the same data type respectively (for example: row keys are ints, qualifiers are longs and values are strings). Thank you, /David On Thu, Dec 6, 2012 at 9:23 PM, David Koch ogd...@googlemail.com wrote: Hello Swarnim, Thank you for your answer. I will try

Drop an HBase backed table

2012-12-09 Thread David Koch
Hello, How can I drop a Hive table which was created using CREATE EXTERNAL TABLE...? I tried DROP TABLE table_name; but the shell hangs. The underlying HBase table should not be deleted. I am using Hive 0.9 Thank you, /David

Thrift Hive client for CDH 4.1 HiveServer2?

2013-01-03 Thread David Morel
Hi all (and happy New Year!) Is it possible to build a perl Thrift client for HiveServer2 (from Cloudera's 4.1.x)? I'm following the instructions found here: http://stackoverflow.com/questions/5289164/perl-thrift-client-to-hive Downloaded Hive from Cloudera's site, then I'm a bit lost: where

Re: Thrift Hive client for CDH 4.1 HiveServer2?

2013-01-05 Thread David Morel
) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:215) ... 4 more Where should I start looking (meaning I haven't a clue)? Thanks! David On 2013-1-4 at 7:16 AM, David Morel dmore...@gmail.com wrote: Hi all (and happy New Year!) Is it possible to build a perl Thrift client for HiveServer2

Re: Thrift Hive client for CDH 4.1 HiveServer2?

2013-01-05 Thread David Morel
: here: https://issues.apache.org/jira/browse/HIVE-2935 https://cwiki.apache.org/Hive/hiveserver2-thrift-api.html HiveServer2 now is CDH extension. I think you can use find cmd to search the CDH src dir to find the .thrift files. 2013/1/5 David Morel dmore...@gmail.com On 4 Jan 2013, at 16

An explanation of LEFT OUTER JOIN and NULL values

2013-01-24 Thread David Morel
Hi! After hitting the curse of the last reducer many times on LEFT OUTER JOIN queries, and trying to think about it, I came to the conclusion there's something I am missing regarding how keys are handled in mapred jobs. The problem shows when I have table A containing billions of rows with

Re: An explanation of LEFT OUTER JOIN and NULL values

2013-01-24 Thread David Morel
On 24 Jan 2013, at 18:16, bejoy...@yahoo.com wrote: Hi David An explain extended would give you the exact pointer. From my understanding, this is how it could work. You have two tables then two different map reduce job would be processing those. Based on the join keys, combination

Re: An explanation of LEFT OUTER JOIN and NULL values

2013-01-24 Thread David Morel
On 24 Jan 2013, at 20:39, bejoy...@yahoo.com wrote: Hi David, The default partitioner used in map reduce is the hash partitioner. So based on your keys they are sent to a particular reducer. Maybe in your current data set, the keys that have no values in the table are all falling in the same
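A common mitigation for the situation described in this thread — all NULL join keys hashing to the same reducer — is to scatter the NULL keys so they spread across reducers while the rows still survive the outer join. A hedged sketch assuming a string join key (table and column names invented):

```sql
-- NULL keys are replaced by random, non-matching values, so the hash
-- partitioner distributes them instead of piling them onto one reducer.
-- The rows still appear in the LEFT OUTER JOIN result with NULLs on
-- the right-hand side, exactly as before.
SELECT a.*, b.*
FROM table_a a
LEFT OUTER JOIN table_b b
  ON COALESCE(a.join_key,
              CONCAT('__null__', CAST(RAND() AS STRING))) = b.join_key;
```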

Re: Real-life experience of forcing smaller input splits?

2013-01-25 Thread David Morel
the mapper, there is simply not enough memory available to it. Since the compression scheme is BLOCK, I expected it would be possible to instruct hive to process only a limited number of fragments instead of everything that's in the file in 1 go. David

Re: Real-life experience of forcing smaller input splits?

2013-01-25 Thread David Morel
for all the answers, everyone! David On Fri, Jan 25, 2013 at 8:46 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Not all files are splittable. Sequence files are. Raw gzip files are not. On Fri, Jan 25, 2013 at 1:47 AM, Nitin Pawar nitinpawar...@gmail.com wrote: set mapred.min.split.size

Re: Avro Backed Hive tables

2013-03-12 Thread David Morel
On 7 Mar 2013, at 2:43, Murtaza Doctor wrote: Folks, Wanted to get some help or feedback from the community on this one: Hello, in that case it is advisable to start a new thread, and not 'reply-to' when you compose your email :-) Have a nice day David

Use Hive reflect() method to call non-static JDK functions

2013-03-18 Thread David Lee
It's relatively straightforward to call static functions in the JDK using reflect. For example, select reflect(java.lang.Math, max, 2, 3) from mytable limit 1; However, how do I use reflect to call non-static functions (e.g., the indexOf() method in the java.lang.String class)? None of the following
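The plain reflect() UDF only handles static methods. Newer Hive releases also ship a reflect2() UDF that invokes an instance method on its first argument — a hedged sketch (availability depends on your Hive version):

```sql
-- reflect() works only for static methods:
SELECT reflect('java.lang.Math', 'max', 2, 3) FROM mytable LIMIT 1;

-- reflect2() treats its first argument as the receiver object,
-- so instance methods such as String.indexOf become callable:
SELECT reflect2('hello world', 'indexOf', 'world') FROM mytable LIMIT 1;
```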

Re: Partition performance

2013-07-03 Thread David Morel
table. David

Re: Seeking Help configuring log4j for sqoop import into hive

2013-11-11 Thread David Morel
) at org.apache.hadoop.mapred.Child.main(Child.java:262) This is usually the case when your PK (on which Sqoop will try to do the split) isn't an integer. my 2c. David

Re: HiveServer2

2013-11-19 Thread David Morel
in Hive (nothing more). Can anyone confirm that behaviour? David

Re: Hive query taking a lot of time just to launch map-reduce jobs

2013-11-25 Thread David Morel
, and the average? David

Re: java.lang.OutOfMemoryError: Java heap space

2013-11-25 Thread David Morel
On 22 Nov 2013, at 9:35, Rok Kralj wrote: If anybody has any clue what is the cause of this, I'd be happy to hear it. On Nov 21, 2013 9:59 PM, Rok Kralj rok.kr...@gmail.com wrote: what does echo $HADOOP_HEAPSIZE return in the environment you're trying to launch hive from? David

Re: Difference in number of row observstions from distinct and group by

2013-11-25 Thread David Morel
outer join of table 1 on table 2. you'd be able to identify quickly what went wrong. Sort the result so you get unlikely dupes, and all. Just trial and error until you nail it. David

Re: Hive query taking a lot of time just to launch map-reduce jobs

2013-11-26 Thread David Morel
On 26 Nov 2013, at 7:02, Sreenath wrote: Hey David, Thanks for the swift reply. Each id will have exactly one file. and regarding the volume on an average each file would be 100MB of compressed data with the maximum going upto around 200MB compressed data. And how will RC files

MIN/MAX issue with timestamps and RCFILE/ORC tables

2013-12-06 Thread David Engel
03:19:42.726 | 2013-09-06 21:01:07.743 | | spreadsheets2.google.com | 7 | 9 | 2013-09-06 03:19:42.726 | 2013-09-06 13:13:19.84 | David -- David Engel da...@istwok.net

Re: Issue with Hive and table with lots of column

2014-01-30 Thread David Gayou
by row basis on those dataset, so basically the more column we have the better it is. We are coming from the SQL world, and Hive is the closest to SQL syntax. We'd like to keep some SQL manipulation on the data. Thanks for the Help, Regards, David Gayou On Tue, Jan 28, 2014 at 8:35 PM, Stephen

Re: Issue with Hive and table with lots of column

2014-01-31 Thread David Gayou
size) My usecase is really to have the most possible columns. Thanks a lot for your help Regards David On Fri, Jan 31, 2014 at 1:12 AM, Edward Capriolo edlinuxg...@gmail.comwrote: Ok here are the problem(s). Thrift has frame size limits, thrift has to buffer rows into memory. Hove

Re: Issue with Hive and table with lots of column

2014-02-18 Thread David Gayou
and hiveserver1. It fails with hiveserver 2. Regards David Gayou On Thu, Feb 13, 2014 at 3:11 AM, Navis류승우 navis@nexr.com wrote: With HIVE-3746, which will be included in hive-0.13, HiveServer2 takes less memory than before. Could you try it with the version in trunk? 2014-02-13 10:49 GMT+09

Re: Issue with Hive and table with lots of column

2014-02-18 Thread David Gayou
1. I have no process with hiveserver2 ... ps -ef | grep -i hive return some pretty long command with a -Xmx8192 and that's the value set in hive-env.sh 2. The select * from table limit 1 or even 100 is working correctly. David. On Tue, Feb 18, 2014 at 4:16 PM, Stephen Sprague sprag

Re: Issue with Hive and table with lots of column

2014-02-18 Thread David Gayou
Sorry, I reported it badly. It's 8192M Thanks, David. On 18 Feb 2014 at 18:37, Stephen Sprague sprag...@gmail.com wrote: oh. i just noticed the -Xmx value you reported. there's no M or G after that number?? I'd like to see -Xmx8192M or -Xmx8G. That *is* very important. thanks, Stephen

Deserializing into multiple records

2014-04-01 Thread David Quigley
We are currently streaming complex documents to hdfs with the hope of being able to query them. Each single document logically breaks down into a set of individual records. In order to use Hive, we preprocess each input document into a set of discrete records, which we save on HDFS and create an

Re: Deserializing into multiple records

2014-04-02 Thread David Quigley
Makes perfect sense, thanks Petter! On Wed, Apr 2, 2014 at 2:15 AM, Petter von Dolwitz (Hem) petter.von.dolw...@gmail.com wrote: Hi David, you can implement a custom InputFormat (extends org.apache.hadoop.mapred.FileInputFormat) accompanied by a custom RecordReader (implements

Re: Deserializing into multiple records

2014-04-03 Thread David Quigley
but nothing I saw actually decomposes nested JSON into a set of discrete records. It's super useful for us. On Wed, Apr 2, 2014 at 2:15 AM, Petter von Dolwitz (Hem) petter.von.dolw...@gmail.com wrote: Hi David, you can implement a custom InputFormat (extends

Re: Deserializing into multiple records

2014-04-08 Thread David Quigley
, Petter 2014-04-04 6:02 GMT+02:00 David Quigley dquigle...@gmail.com: Thanks again Petter, the custom input format was exactly what I needed. Here is example of my code in case anyone is interested https://github.com/quicklyNotQuigley/nest Basically gives you SQL access

Re: get_json_object for nested field returning a String instead of an Array

2014-04-08 Thread David Quigley
Hi Narayanan, We have had some success with a similar use case using a custom input format / record reader to recursively split arbitrary JSON into a set of discrete records at runtime. No schema is needed. Doing something similar might give you the functionality you are looking for.

Re: What is the minimal required version of Hadoop for Hive 0.13.0?

2014-04-23 Thread David Gayou
Is it now the minimal required version ? If not, will there be a Hive 0.13.1 for older hadoop? Regards, David On Wed, Apr 23, 2014 at 4:00 PM, Dmitry Vasilenko dvasi...@gmail.comwrote: Hive 0.12.0 (and previous versions) worked with Hadoop 0.20.x, 0.23.x.y, 1.x.y, 2.x.y. Hive 0.13.0 did

Problem adding jar using pyhs2

2014-04-25 Thread David Engel
and Beeline. It seems the add part of any add file|jar|archive ... command needs to get stripped off somewhere before it gets passed to AddResourceProcessor.run(). Unfortunately, I can't find that location when the command is received from pyhs2. Can someone help? David -- David Engel da

Re: Problem adding jar using pyhs2

2014-04-28 Thread David Engel
expects jar file.jar to get passed to it. That's how it appears to work when add jar file.jar is run from a stand-alone Hive CLI and from beeline. David On Sat, Apr 26, 2014 at 12:14:53AM -0700, Brad Ruderman wrote: An easy solution would be to add the jar to the classpath or auxlibs therefore

Cannot Upgrade a Hive UDF without cluster restart. UDF is possibly cached.

2014-04-28 Thread David Zaebst
Hi all, We have a few Hive UDFs where I work. These are deployed by a bootstrap script so that the JAR files are in Hive's CLASSPATH before the server starts. This works to load the UDF whenever a cluster is started and then the UDF can be loaded with the ADD JAR and CREATE TEMPORARY FUNCTION

Re: Problem adding jar using pyhs2

2014-04-29 Thread David Engel
Hi Brad, Your test, after editing for local host/file names, etc., worked. It must be something else I'm doing wrong in my development stuff. At least I know it should work. I'll figure it out eventually. Thanks again. David On Mon, Apr 28, 2014 at 10:22:57AM -0700, Brad Ruderman wrote: Hi

Altering the Metastore on EC2

2014-08-11 Thread David Beveridge
We are creating a Hive schema for reading massive JSON files. Our JSON schema is rather large, and we have found that the default metastore schema for Hive cannot work for us as-is. To be specific, one field in our schema has about 17KB of nested structs within it. Unfortunately, it appears

using Hive to create tables from unstructured data.

2014-11-12 Thread David Novogrodsky
STRING, timeOfCall STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ( input.regex = ([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)\n, output.format.string = %1$s %2$s %3$s %4$s %5$s ) LOCATION '/user/cloudera/vector/callRecords'; David

Using xPATH and Hive SQL to access XML data, but xPath a problem

2014-12-08 Thread David Novogrodsky
David Novogrodsky david.novogrod...@gmail.com http://www.linkedin.com/in/davidnovogrodsky

Hive Transactions fail

2015-03-05 Thread David Simoes
I've had some trouble enabling transactions in Hive 1.0.0, and I've made a post at http://stackoverflow.com/questions/28867368/hive-transactions-are-crashing Could anyone check it out and give me some pointers on why things are crashing? Tyvm, Dave

connection pooling for hive JDBC client

2015-06-03 Thread McWhorter, David
to interact with and query Hive through the JDBC api from an application. Thank you, David McWhorter — David McWhorter Senior Developer, Foundations Informatics and Technology Services Office: 434.260.5232 | Mobile: 434.227.2551 david_mcwhor...@premierinc.commailto:david_mcwhor...@premierinc.com

Re: Hive Data into a Html Page

2015-07-31 Thread David Morel
Hive is not really meant to serve data as fast as a web page needs. You'll have to use some intermediate (could even be a db file, or template-toolkit-generated static pages). David On 28 Jul 2015 at 8:53 AM, siva kumar siva165...@gmail.com wrote: Hi Lohith, We use http

External sorted tables

2015-07-30 Thread David Capwell
We are trying to create an external table in hive. This data is sorted, so we wanted to tell hive about this. When I do, it complains about parsing the create. CREATE EXTERNAL TABLE IF NOT EXISTS store.testing ( ... . . . . . . . . . . . . . . . . . . . timestamp bigint, ...) . . . . . . . . . . .

RE: External sorted tables

2015-08-03 Thread David Capwell
that the data **is** in fact sorted... If there is something specific you are trying to accomplish by specifying the sort order of that column, perhaps you can elaborate on that. Otherwise, leave out the 'sorted by' statement and you should be fine. *From:* David Capwell [mailto:dcapw

RE: External sorted tables

2015-08-03 Thread David Capwell
. *From:* David Capwell [mailto:dcapw...@gmail.com] *Sent:* Monday, August 03, 2015 11:59 AM *To:* user@hive.apache.org *Subject:* RE: External sorted tables Mostly wanted to tell hive it's sorted so it could use more efficient joins like a map side join. No other reason On Aug 3, 2015 10:47 AM

Re: External sorted tables

2015-08-03 Thread David Capwell
to insert data correctly by specifying the number of reducers to be equal to the number of buckets, and using CLUSTER BY and SORT BY commands in their query. On Thu, Jul 30, 2015 at 7:22 PM, David Capwell dcapw...@gmail.com wrote: We are trying to create an external table in hive. This data
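Two things are likely at play in this thread: `timestamp` is a reserved word that trips the parser, and Hive only accepts SORTED BY as part of a CLUSTERED BY ... INTO n BUCKETS clause. A hedged sketch with invented names and bucket count (Hive will record, but not verify, that the data is sorted):

```sql
-- Back-quote the reserved column name and declare sort order through
-- the bucketing clause; the metastore stores this as metadata only.
CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
  id BIGINT,
  `timestamp` BIGINT
)
CLUSTERED BY (id) SORTED BY (`timestamp` ASC) INTO 32 BUCKETS
LOCATION '/data/store/testing';
```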

Re: Perl-Hive connection

2015-07-30 Thread David Morel
/lib/Thrift/API/HiveClient2.pm David

Re: Perl-Hive connection

2015-08-06 Thread David Morel
You probably forgot to load (use) the module before calling new(). On 6 Aug 2015 at 8:49 AM, siva kumar siva165...@gmail.com wrote: Hi David, I have tried the link you posted. But I'm stuck with the error message below: Can't locate object method new via package

ORC NPE while writing stats

2015-09-01 Thread David Capwell
We are writing ORC files in our application for hive to consume. Given enough time, we have noticed that writing causes an NPE when working with a string column's stats. Not sure what's causing it on our side yet, since replaying the same data is just fine; it seems more like this just happens over

Re: ORC NPE while writing stats

2015-09-03 Thread David Capwell
Thanks, that should help moving forward On Sep 3, 2015 10:38 AM, "Prasanth Jayachandran" < pjayachand...@hortonworks.com> wrote: > > > On Sep 2, 2015, at 10:57 PM, David Capwell <dcapw...@gmail.com> wrote: > > > > So, very quickly looked at the JIRA a

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
Also, the data put in are primitives, structs (list), and arrays (list); we don't use any of the boxed writables (like text). On Sep 2, 2015 12:57 PM, "David Capwell" <dcapw...@gmail.com> wrote: > We have multiple threads writing, but each thread works on one file, so > orc

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
, Sep 2, 2015 at 7:34 PM, David Capwell <dcapw...@gmail.com> wrote: > Thanks for the jira, will see if that works for us. > > On Sep 2, 2015 7:11 PM, "Prasanth Jayachandran" > <pjayachand...@hortonworks.com> wrote: >> >> Memory manager is made thread local

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
ing for me, so no issue sharding and not configuring? Thanks for your time reading this email! On Wed, Sep 2, 2015 at 8:57 PM, David Capwell <dcapw...@gmail.com> wrote: > So, very quickly looked at the JIRA and I had the following question; > if you have a pool per thread rather than global

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
s.memory). We may be missing a synchronization on the > MemoryManager somewhere and thus be getting a race condition. > > Thanks, >Owen > > On Wed, Sep 2, 2015 at 12:57 PM, David Capwell <dcapw...@gmail.com> wrote: > >> We have multiple threads writing, but each thread works o

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
-10191 and see if that helps? > > On Sep 2, 2015, at 8:58 PM, David Capwell <dcapw...@gmail.com> wrote: > > I'll try that out and see if it goes away (not seen this in the past 24 > hours, no code change). > > Doing this now means that I can't share the memory, so will

Re: ORC NPE while writing stats

2015-09-02 Thread David Capwell
is that estimateStripeSize won't always give the correct value since my thread is the one calling it... With everything ThreadLocal, the only writers would be the ones in the same thread, so should be better. On Wed, Sep 2, 2015 at 9:47 PM, David Capwell <dcapw...@gmail.com> wrote: >

Network throughput from HiveServer2 to JDBC client too low

2016-06-20 Thread David Nies
network throughput? Thank you in advance! Yours David Nies Developer, Business Intelligence ADITION technologies AG Oststraße 55, D-40211 Düsseldorf Schwarzwaldstraße 78b, D-79117 Freiburg im Breisgau T +49 211 987400 30 F +49 211 987400 33 E david.n...@adition.com <mailto:david.n...@aditi

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread David Nies
In my test case below, I’m using `beeline` as the Java application receiving the JDBC stream. As I understand, this is the reference command line interface to Hive. Are you saying that the reference command line interface is not efficiently implemented? :) -David Nies > Am 20.06.2016 um 17

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread David Nies
help fix those codepaths as part of > the joint effort with the ODBC driver teams. I’ll see what I can do. I can’t restart the server at will though, since other teams are using it as well. > > Cheers, > Gopal > Thank you :) -David

Re: Network throughput from HiveServer2 to JDBC client too low

2016-06-21 Thread David Nies
in size. > > JDBC on its own should work. Is this an ORC table? > > What version of Hive are you using? Kindly find the answer to these questions in my first eMail :) > > HTH -David > > > > > > Dr Mich Talebzadeh > > LinkedIn > https

Re: read-only mode for hive

2016-03-09 Thread David Capwell
Could always set the tables output format to be the null output format On Mar 8, 2016 11:01 PM, "Jörn Franke" wrote: > What is the use case? You can try security solutions such as Ranger or > Sentry. > > As already mentioned another alternative could be a view. > > > On 08
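Of the approaches mentioned in this thread, a view is the simplest to sketch. Assuming SQL-standard-based authorization is enabled (names invented):

```sql
-- Expose a read-only surface over the base table.
CREATE VIEW readonly_v AS SELECT * FROM base_table;

-- Grant SELECT on the view only, leaving the base table
-- inaccessible to the user.
GRANT SELECT ON TABLE readonly_v TO USER analyst;
```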

Re: Hive Metadata tables of a schema

2016-04-05 Thread David Morel
Better use HCatalog for this. David On 5 Apr 2016 at 10:14, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote: > So you want to interrogate the Hive metastore and get information about > objects for a given schema/database in Hive. > > These info are kept in Hiv

ORC tables failing after upgrading from 0.14 to 2.1.1

2017-05-05 Thread David Capwell
Our schema is nested with top level having 5 struct types. When we try to query these structs we get the following back *ORC does not support type conversion from file type string (1) to reader type array (1)* Walking through hive in a debugger I see that schema evolution sees the correct file

Re: Roaring Bitmap UDFs

2017-12-08 Thread David Capwell
Think of it as a bloom filter that's more dynamic. It works well when cardinality is low, but grows quickly to out-cost a bloom filter as cardinality grows. This data structure supports existence queries, but your email sounds like you want counts. If so, it's not really the best fit. On Dec 8, 2017 5:00 PM,

Re: Creating temp tables in select statements

2019-03-28 Thread David Lavati
table insertion you can use a syntax somewhat similar to VALUES https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingvaluesintotablesfromSQL Kind Regards, David On Wed, Mar 27, 2019 at 12:40 AM Mainak Ghosh wrote: > Hello, > > We want to create
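The VALUES-based insertion referenced above, plus a CTE as the inline alternative — a sketch with invented names:

```sql
-- Materialize ad-hoc rows in a temporary table (dropped at session end).
CREATE TEMPORARY TABLE staging (id INT, name STRING);
INSERT INTO TABLE staging VALUES (1, 'a'), (2, 'b');

-- Or keep the rows inline with a CTE, no table needed:
WITH staging AS (SELECT 1 AS id, 'a' AS name)
SELECT * FROM staging;
```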

Re: Read Hive ACID tables in Spark or Pig

2019-03-11 Thread David Morin
to get the valid transactions for each table from the Hive Metastore and, then, read all related files. Is that correct? Thanks, David On Sun, 10 Mar 2019 at 01:45, Nicolas Paris wrote: > Thanks Alan for the clarifications. > > Hive has made such improvements it has lost its ol

Re: How to update Hive ACID tables in Flink

2019-03-12 Thread David Morin
this case, though it only handles insert (not update), > so if you need updates you'd have to do the merge as you are currently > doing. > > Alan. > > On Mon, Mar 11, 2019 at 2:09 PM David Morin > wrote: > >> Hello, >> >> I've just implemented a pipeline ba

Re: How to update Hive ACID tables in Flink

2019-03-12 Thread David Morin
Tue, Mar 12, 2019 at 12:24 PM David Morin > wrote: > >> Thanks Alan. >> Yes, the problem in fact was that this streaming API does not handle >> update and delete. >> I've used native Orc files and the next step I've planned to do is the >> use of ACID support

How to update Hive ACID tables in Flink

2019-03-11 Thread David Morin
that contain these delta Orc files. Then, MERGE INTO queries are executed periodically to merge data into the Hive target table. It works pretty well but we want to avoid the use of these Merge queries. How can I update Orc files directly from my Flink job ? Thanks, David
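The periodic MERGE described above looks roughly like this on a Hive 2.2+ transactional table — a sketch with invented table, column, and flag names; the target must be a bucketed ORC table with transactional=true:

```sql
-- Fold staged delta rows into the ACID target table.
MERGE INTO target t
USING delta_staging s
ON t.id = s.id
WHEN MATCHED AND s.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val);
```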

Wiki Write Access

2019-02-07 Thread David M
All, I'd like to get wiki write access for the Apache Hive wiki, so I can update some documentation based on a recent patch. My confluence name is mcginnda. Thanks! David McGinnis

RE: Wiki Write Access

2019-02-10 Thread David M
I realized I mistyped my username. My confluence username is mcginnisda. Please give me write access to the Hive confluence wiki, or tell me where I need to request it. Thanks! From: David M Sent: Thursday, February 7, 2019 10:38 AM To: user@hive.apache.org Subject: Wiki Write Access All

Orc files in hdf: NullPointerException (RunLengthIntegerReaderV2)

2019-02-11 Thread David Morin
Hello, I am facing an error when I try to read my Orc files from Hive (external table) or Pig or with hive --orcfiledump. These files are generated with Flink using the Orc Java API with vectorized columns. If I create these files locally (/tmp/...), push them to hdfs, then I can read the content

S3 with Tez Performance Issues?

2019-07-01 Thread David M
based on the number of files, but only if the files are located in S3. Can someone confirm this? If this is the case, is there a JIRA tracking a fix, or documentation on why this has to be this way? If not, how can I make sure we use more mappers in cases like above? Thanks! David McGinnis

Hive Major Compaction fails (cleaning step)

2019-08-25 Thread David Morin
Hello, I've been trying "ALTER TABLE (table_name) COMPACT 'MAJOR'" on my Hive 2 environment, but it always fails (HDP 2.6.5 precisely). It seems that the merged base file is created but the delta is not deleted. I found that it was because the HiveMetastore Client can't connect to the metastore
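For reference, the statement in question and the standard way to watch its progress on ACID tables:

```sql
-- Request a major compaction of the table's base and delta files.
ALTER TABLE table_name COMPACT 'major';

-- Inspect the compaction queue; a request stuck in the
-- 'ready for cleaning' state points at the cleaner step
-- (which deletes obsolete deltas) failing, as described above.
SHOW COMPACTIONS;
```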
