Hi,
What are the drawbacks of ingesting large quantities of data into Hive through a
JDBC connection, versus placing the data directly into HDFS for Hive to
incorporate?
thanks
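For context, the "place data directly into HDFS" route usually means pointing Hive at files rather than streaming rows over JDBC. A minimal sketch (table name, columns and paths are all hypothetical):

```sql
-- External table over files already placed in HDFS; no per-row JDBC transfer
CREATE EXTERNAL TABLE web_logs (ts STRING, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/raw/web_logs';

-- Or move staged files into the table (a metadata-level file move in HDFS)
LOAD DATA INPATH '/staging/web_logs' INTO TABLE web_logs;
```

Both operations are file moves/registrations in HDFS, whereas a JDBC insert path serializes every row through the driver.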
You are very kind Sir
On Sunday, 30 October 2016, 16:42, Devopam Mittra wrote:
+1
Thanks and regards
Devopam
On 30 Oct 2016 9:37 pm, "Mich Talebzadeh" wrote:
Enjoy the festive season.
Regards,
Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd
Thanks Mich looking forward to it :)
On Tuesday, 19 July 2016, 19:13, Mich Talebzadeh
wrote:
Hi all,
This will be in London tomorrow Wednesday 20th July starting at 18:00 hour for
refreshments and kick off at 18:30, 5 minutes walk from Canary Wharf Station,
Jubilee Line
If you wish y
Hive on TEZ + LLAP: I would recommend a distribution such as
Hortonworks where everything is already configured. As far as I know, LLAP is
currently not part of any distribution.
On 15 Jul 2016, at 17:04, Ashok Kumar wrote:
Hi,
Has anyone managed to make Hive work with Tez + LLAP as the query engine in
place of MapReduce, please?
If you configured it yourself, which versions of Tez and LLAP work with Hive 2?
Do I need to build Tez from source, for example?
Thanks
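If Tez itself is installed on the cluster, the engine switch is a session-level setting. A sketch (LLAP requires additional daemons beyond this and is not covered here; the table name is hypothetical):

```sql
-- Per-session engine selection; valid values include mr, tez and spark
SET hive.execution.engine=tez;

-- Subsequent queries compile to Tez DAGs instead of MapReduce jobs
SELECT COUNT(*) FROM some_table;
```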
Hi Gurus,
Advice appreciated from Hive gurus.
My colleague has been using Cassandra. However, he says it is too slow and not
user friendly. MongoDB as a document database is pretty neat but not fast enough.
My main concern is fast writes per second and good scaling.
Hive on Spark or Tez?
How about Hb
Hi Mich,
Regarding your recent presentation in London on this topic, "Running Spark on Hive
or Hive on Spark":
Have you made any more interesting findings that you would like to bring up?
If Hive offers both Spark and Tez in addition to MR, what is stopping one
from using Spark? I still don't get why TEZ + LLAP.
Thanks.
Will this presentation be recorded as well?
Regards
On Wednesday, 6 July 2016, 22:38, Mich Talebzadeh
wrote:
Dear forum members
I will be presenting on the topic of "Running Spark on Hive or Hive on Spark,
your mileage varies" in Future of Data: London. Details: Organized by:
Horton
Hi,
Looking at this presentation, Hive on Spark is Blazing Fast ..
What is the latest version of Spark that can run as an engine for Hive, please?
Thanks
P.S. I am aware of Hive on TEZ but that is not what I am interested in here, please
Warmest regards
Greeting gurus,
When I use
ANALYZE TABLE COMPUTE STATISTICS FOR COLUMNS,
where can I get the last stats time?
DESC FORMATTED does not show it
thanking you
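For reference, column statistics live in the metastore and are surfaced per column rather than in the table-level DESC FORMATTED output. A sketch with a hypothetical table and column:

```sql
ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS;

-- Table parameters (numRows, COLUMN_STATS_ACCURATE, transient_lastDdlTime)
DESCRIBE FORMATTED sales;

-- Per-column statistics (min/max, distinct values, nulls) for one column
DESCRIBE FORMATTED sales amount;
```

The `transient_lastDdlTime` table parameter is a Unix timestamp and is the closest thing to a "last stats time" exposed through DESCRIBE.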
Hi Dr Mich,
This is very good news. I will be interested to know how Hive engages with
Spark as an engine. What Spark processes are used to make this work?
Thanking you
On Monday, 23 May 2016, 19:01, Mich Talebzadeh
wrote:
Have a look at this thread
Dr Mich Talebzadeh LinkedIn
https
Hi,
Does anyone have suggestions for how to create and copy Hive and Spark tables from
Production to UAT?
One way would be to copy the table data to external files, then move the
external files to a local target directory and populate the tables in the target
Hive with that data.
Is there an easier way of doing this?
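Hive's built-in EXPORT/IMPORT commands bundle the data files together with the table metadata, which avoids recreating DDL by hand. A sketch with hypothetical names and paths:

```sql
-- On the Production cluster
EXPORT TABLE sales TO '/tmp/hive_exports/sales';

-- Copy /tmp/hive_exports/sales between clusters (e.g. with distcp), then on UAT:
IMPORT TABLE sales FROM '/tmp/hive_exports/sales';
```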
Hi gurus,
I have an ORC table bucketed on invoicenumber with "transactional"="true"
I am trying to update the invoicenumber column used for bucketing this table, but it
comes back with
Error: Error while compiling statement: FAILED: SemanticException [Error
10302]: Updating values of bucketing column
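Hive refuses to UPDATE a column the table is bucketed on, because the new value would belong in a different bucket file. A common workaround on an ACID table is delete-plus-insert; a sketch assuming a hypothetical two-column schema:

```sql
-- Assumes invoices(invoicenumber STRING, amount DECIMAL(10,2)) is
-- transactional and bucketed on invoicenumber
DELETE FROM invoices WHERE invoicenumber = 'INV-0001';
INSERT INTO TABLE invoices VALUES ('INV-0001-R', 99.50);
```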
Dr Mich,
My two cents here.
I don't have direct experience of Impala, but in my humble opinion I share your
views that Hive provides the best metastore of all Big Data systems. Looking
around, almost every product in one form or shape uses Hive code somewhere. My
colleagues inform me that Hive i
based.
John
From: Ashok Kumar
Reply-To: "user@hive.apache.org" , Ashok Kumar
Date: Wednesday, February 3, 2016 at 11:45 AM
To: User
Subject: Hive optimizer
Hi,
Is the Hive optimizer a cost-based optimizer (CBO), a rule-based optimizer (RBO),
or neither?
thanks
From: Ashok Kumar
[mailto:ashok34...@yahoo.com]
Sent: 31 Januar
Thanks,
Can sqoop create this table as ORC in Hive?
On Sunday, 31 January 2016, 13:11, Nitin Pawar
wrote:
check sqoop
On Sun, Jan 31, 2016 at 6:36 PM, Ashok Kumar
Hi,
What is the easiest method of importing data from an Oracle 11g table into Hive,
please? This will be a weekly periodic job. The source table has 20 million
rows.
I am running Hive 1.2.1
regards
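Sqoop's plain --hive-import lands data as a text-format table, so one common pattern is a text staging table converted to ORC inside Hive (a sketch, table names hypothetical; newer Sqoop versions can also write ORC directly via HCatalog options):

```sql
-- Staging table that the weekly Sqoop --hive-import job overwrites
CREATE TABLE emp_stage (id INT, name STRING) STORED AS TEXTFILE;

-- Final ORC table queried by users
CREATE TABLE emp_orc (id INT, name STRING) STORED AS ORC;

-- Conversion step run after each Sqoop load
INSERT OVERWRITE TABLE emp_orc SELECT id, name FROM emp_stage;
```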
Thanks Owen,
I got a bit confused comparing ORC with what I know about indexes in relational
databases. I still need to understand it a bit better.
Regards
From: Owen O'Malley [mailto:omal...@apache.org]
Sent: 19 January 2016 17:57
To: user@hive.apache.org; Ashok Kumar
Cc: Jörn Franke
Su
group. The reader takes a SearchArgument (e.g. age > 100) that limits which
rows are required for the query, and can avoid reading an entire file, or at
least sections of the file.
.. Owen
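To benefit from those SearchArguments at query time, predicate pushdown has to be enabled. A sketch (the table is hypothetical):

```sql
-- Let ORC readers use min/max statistics to skip stripes and row groups
SET hive.optimize.index.filter=true;

SELECT * FROM people WHERE age > 100;
```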
On Tue, Jan 19, 2016 at 7:50 AM, Ashok Kumar wrote:
Hi,
I have read some notes on ORC files in Hive and
Hi,
I have read some notes on ORC files in Hive and indexes.
The document describes the indexes but makes reference to statistics.
Indexes
ORC provides three levels of indexes within each file: file
level - statistics about the values in each c
Hi,
Is there an equivalent to the Microsoft IDENTITY column in Hive, please?
Thanks and regards
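There is no IDENTITY/auto-increment column type in Hive. A frequently used substitute, sketched here with hypothetical names, is to assign surrogate keys at load time with a window function (note the numbering is not stable across separate loads):

```sql
INSERT OVERWRITE TABLE customers
SELECT ROW_NUMBER() OVER (ORDER BY name) AS id, name
FROM customers_staging;
```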
hi,
What is the equivalent of foreign keys in Hive?
Thanks
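For what it's worth, Hive 2.1 added informational (declared but not enforced) primary and foreign key constraints; a sketch with hypothetical tables:

```sql
CREATE TABLE dim_product (
  product_id INT,
  PRIMARY KEY (product_id) DISABLE NOVALIDATE
);

CREATE TABLE fact_sales (
  sale_id    INT,
  product_id INT,
  CONSTRAINT fk_product FOREIGN KEY (product_id)
    REFERENCES dim_product (product_id) DISABLE NOVALIDATE
);
```

The optimizer may use these declarations, but Hive never validates them on write.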
Very well answered by Mich. Thanks Mich !! From: Mich
Talebzadeh [mailto:m...@peridale.co.uk]
Sent: Sunday, January 03, 2016 8:35 PM
To: user@hive.apache.org; 'Ashok Kumar'
Subject: RE: Immutable data in Hive Hi Ashok. I will have a g
Any comments on ELT would be greatly appreciated, gurus.
With warmest greetings
On Wednesday, 30 December 2015, 18:20, Ashok Kumar
wrote:
Thank you sir, very helpful. Could you also briefly describe, from your
experience, the major differences between traditional ETL in DW and ELT in
are less common, they are still required
(slowly changing dimensions, fixing wrong data, deleting records for compliance,
etc.). Also, streaming data into warehouses from transactional systems is a
common use case.
Alan.
Ashok Kumar December 29, 2015 at 14:59
Hi,
Can someone please clarify what "immutable data" in Hive means?
I have been told that data in Hive is/should be immutable, but in that case why
do we need transactional tables in Hive that allow updates to data?
thanks and greetings
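By way of illustration: classic Hive tables are append-only (immutable), while the ACID tables added in Hive 0.14 opt in to updates and deletes. A sketch with hypothetical names:

```sql
-- Session settings required for the ACID transaction manager
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

CREATE TABLE events (id INT, status STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

UPDATE events SET status = 'done' WHERE id = 1;
```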
RC applies to Parquet as well). If you are doing
queries that select the whole row each time, columnar formats like ORC won't be
your friend. Also, if you are storing self-structured data such as JSON or
Avro, you may find text or Avro storage to be a better format.
Alan.
Ashok Kum
Hi Gurus,
I am trying to understand the advantages that the ORC file format offers over RC.
I have read the existing documents but I still don't seem to grasp the main
differences.
Can someone explain to me, as a user, where ORC scores when compared to RC? What
I'd like to know is mainly the performance
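For comparison purposes, the DDL difference is only the storage clause, while tuning knobs such as compression live in table properties. A sketch with hypothetical names:

```sql
CREATE TABLE sales_rc  (id INT, amount DECIMAL(10,2)) STORED AS RCFILE;

CREATE TABLE sales_orc (id INT, amount DECIMAL(10,2))
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY');
```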
you know more
details and both sides decide on potential ways forward, you can start doing PoCs
and see what works and what does not. It is important that you break the old ties
created by more traditional data warehouse approaches in the past and go beyond
the comfort zone.
On 18 Dec 2015, at 22:01, Ashok
From: Ashok Kumar
Reply-To: User , Ashok Kumar
Date: Friday, December 18, 2015 at 4:01 PM
To: User
Subject: The advantages of Hive/Hadoop compar
Gurus,
Some analysts keep asking me about the advantages of having Hive tables when the star
schema in a Data Warehouse (DW) does the same.
For example, if you have fact and dimension tables in a DW and just import them
into Hive via, say, Sqoop, what are we going to gain?
I keep telling them storage econo
This is great news, sir. It shows perseverance pays at last.
Can you inform us when the write-up is ready so I can set it up as well, please?
I know a bit about the advantages of having Hive use the Spark engine. However,
the general question I have is when one should use Hive on Spark as opposed to
Hi,
I would like to know best practices for monitoring the health and performance of
Hive and HiveServer, troubleshooting, catching errors, etc.
To be clear, we do not use any bespoke monitoring tool and are keen on developing
our own in-house tools to be integrated into general monitoring tools to b
Hi gurus,
What is the easiest way of exporting one database in Hive in Test and importing
it into, say, a DEV database in another instance?
How will the metastore at the target handle this, please?
Thanks
mber 2015, 13:03, Binglin Chang
wrote:
Hive transparently translates queries into MapReduce jobs that are executed in
HBase
I think this is not correct; are you sure it is from a book?
On Tue, Nov 10, 2015 at 6:56 PM, Ashok Kumar wrote:
Hi,
I have read the following in a book about Hadoop:
Apache Hive is a data warehouse infrastructure built on top of Hadoop for
providing data summary, ad hoc queries, and the analysis of large data sets
using an SQL-like language called HiveQL.
Hive transparently translates queries into MapReduce j
Hi,
I would like to understand a bit more about insert/update/select operations in
Hive.
I believe the ORC table format offers the best performance and concurrency.
Is ORC the best format for DML operations? What is the granularity of locks? Is
it partition or row?
Also how ACID properties implement
Thank you sir. Very helpful
On Thursday, 29 October 2015, 15:22, Alan Gates
wrote:
Ashok Kumar October 28, 2015 at 22:43 hi gurus,
kindly clarify the following please
- Hive currently does not support indexes or indexes are not used in the
query
Mostly true. There
hi gurus,
kindly clarify the following please
- Hive currently does not support indexes, or indexes are not used in the
query
- The lowest granularity for concurrency is the partition. If a table is
partitioned, then the partition will be locked in a DML operation
- What is the best file format t
Hi gurus,
I can use Sqoop import to get RDBMS data, say from Oracle, into Hive first, and then
use incremental append for new rows with a PK and last value.
However, how do you account for updates and deletes with Sqoop without a full
load of the table from the RDBMS into Hive?
Thanks
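One widely used reconciliation pattern (a sketch with hypothetical table and column names, assuming the extract carries a last-modified timestamp and a delete flag) is to union the base table with the incremental delta and keep only the latest row per key:

```sql
INSERT OVERWRITE TABLE base
SELECT id, payload, modified_ts
FROM (
  SELECT id, payload, modified_ts, deleted,
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY modified_ts DESC) AS rn
  FROM (
    SELECT id, payload, modified_ts, false AS deleted FROM base
    UNION ALL
    SELECT id, payload, modified_ts, deleted FROM delta
  ) merged
) ranked
WHERE rn = 1 AND NOT deleted;
```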
Hi gurus,
Kindly help me understand the advantage that Impala has over Hive.
I read a note that Impala does not use the MapReduce engine and is therefore very
fast for queries compared to Hive. However, Hive, as I understand it, is widely used
everywhere!
Thank you
From:
Ashok Kumar [mailto:ashok34...@yahoo.com]
Sent: 10 April 2015 17:46
To: user@hive.apache.org
Subject: partition and bucket
Greetings all,
Glad to join the user group. I am from a DBA background: Oracle/Sybase/MSSQL.
I would like to understand partition and bucketing in Hive an
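Coming from an RDBMS, a useful mental model is that a partition is an HDFS directory and a bucket is a hash-assigned file inside it. A sketch with hypothetical names:

```sql
CREATE TABLE orders (
  order_id INT,
  amount   DECIMAL(10,2)
)
PARTITIONED BY (order_date STRING)      -- one directory per date value
CLUSTERED BY (order_id) INTO 8 BUCKETS  -- hash(order_id) picks the file
STORED AS ORC;
```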