Re: Jira Doc Access

2020-04-14 Thread David Mollitor
Thanks Ashutosh! On Mon, Apr 13, 2020 at 12:27 PM Ashutosh Chauhan wrote: > Hi David, > Added you to Hive wiki. > Thanks, > Ashutosh > > On Mon, Apr 13, 2020 at 6:39 AM David Mollitor wrote: > >> Hello Team, >> >> Is anyone able to grant me ac

Jira Doc Access

2020-04-13 Thread David Mollitor
Hello Team, Is anyone able to grant me access to the Apache Hive Wiki (dmollitor)? Also, is there any discussion/interest in moving docs into the git repo? Thanks!

Re: Query Failures

2020-02-14 Thread David Mollitor
https://community.cloudera.com/t5/Support-Questions/Map-and-Reduce-Error-Java-heap-space/td-p/45874 On Fri, Feb 14, 2020, 6:58 PM David Mollitor wrote: > Hive has many optimizations. One is that it will load the data directly > from storage (HDFS) if it's a trivial query. For e

Re: Query Failures

2020-02-14 Thread David Mollitor
Hive has many optimizations. One is that it will load the data directly from storage (HDFS) if it's a trivial query. For example: Select * from table limit 10; In natural language it says "give me any ten rows (if available) from the table." You don't need the overhead of launching a full

Re: Why Hive uses MetaStore?

2020-01-15 Thread David Mollitor
In the beginning, Hive was a command-line tool. All the heavy lifting happened on the user's local box. If a user wanted to execute Hive from their laptop, or a server, it always needed access to the list of available tables (and their schemas and their locations), otherwise every SQL script

Re: Alternatives to Streaming Mutation API in Hive 3.x

2020-01-13 Thread David Mollitor
Hello, Streaming? NiFi Upserts? HBase, Kudu, Hive 3.x Doing upserts on Hive can be cumbersome, depending on the use case. If upserts are being submitted continuously and quickly, they can overwhelm the system because each will require a scan across the data set (for all intents and purposes) for

Re: What is the Hive HA processing mechanism?

2019-11-15 Thread David Mollitor
Hello, Not sure if this answers your question, but please note the following: Processing occurs via MapReduce, Spark, or Tez. The processing engines run on top of YARN. Each processing engine derives much of its HA from YARN. There are some quirks there, but these engines running on YARN is