> Is there a way to create an external table on a directory, extract 'key' as
> file name and 'value' as file content and write to a sequence file table?
Do you care that it is a sequence file?
The HDFS HAR format was invented for this particular problem; check whether
the "hadoop archive" command fits your use case.
Hello,
I am getting this exception when my query finishes which results in job
failure.
java.sql.SQLException: org.apache.http.NoHttpResponseException: The target
server failed to respond
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
Any help is appreciated!
Thanks
Hi
I’m not sure how this will solve the issue you mentioned, but just for the
fun of it –
Here is the code.
Dudu
set textinputformat.record.delimiter='\0';
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;
create external table if not exists files_ext
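Dudu's snippet is cut off at the table definition. A possible continuation (the column, table name, and location are guesses, not his actual code): with the '\0' record delimiter set above, each small file is read as a single row, and Hive's INPUT__FILE__NAME virtual column supplies the file name as the key.

```sql
-- Continuation sketch; table names, columns, and location are assumptions.
-- The '\0' delimiter never matches, so each file becomes one record.
create external table if not exists files_ext (content string)
location '/data/small_files';

create table if not exists files_seq (key string, value string)
stored as sequencefile;

-- INPUT__FILE__NAME is a Hive virtual column holding the source file path.
insert overwrite table files_seq
select input__file__name, content
from files_ext;
```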
The fundamental question is: do you need these recurring updates to
dimension tables throttling your Hive tables?
Besides, why bother with ETL when one can do ELT?
For the dimension table, just add two additional columns, namely
, op_type int
, op_time timestamp
op_type = 1/2/3
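Concretely, that append-only pattern might look like this (table and column names are illustrative, and the 1/2/3 coding is presumed to mean insert/update/delete):

```sql
-- Illustrative sketch; names and the op_type coding are assumptions.
create table dim_customer (
  customer_id int,
  name        string,
  op_type     int,       -- presumed: 1 = insert, 2 = update, 3 = delete
  op_time     timestamp
);

-- Latest state per key: keep the most recent operation, drop deletes.
create view dim_customer_current as
select customer_id, name
from (
  select customer_id, name, op_type,
         row_number() over (partition by customer_id
                            order by op_time desc) as rn
  from dim_customer
) t
where rn = 1 and op_type <> 3;
```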
> Dimensions change, and I'd rather do update than recreate a snapshot.
Slowly changing dimensions are the common use-case for Hive's ACID MERGE.
The feature you need is most likely covered by
https://issues.apache.org/jira/browse/HIVE-10924
From the 2nd comment on that JIRA:
"Once an hour, a set of
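For reference, a minimal MERGE along the lines of that JIRA looks like this (table and column names are made up, and the target must be a transactional/ACID Hive table on Hive 2.2+):

```sql
-- Sketch only; dim_customer and staging_customer are hypothetical names.
-- Requires a transactional (ACID) target table.
merge into dim_customer as t
using staging_customer as s
on t.customer_id = s.customer_id
when matched then
  update set name = s.name, city = s.city
when not matched then
  insert values (s.customer_id, s.name, s.city);
```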
Hi,
Do we have plans to add SSL to connection client - REST service in WebHCat?
Thanks,
Alina
I'm trying to resolve small files issue using Hive.
Is there a way to create an external table on a directory, extract 'key' as
file name and 'value' as file content and write to a sequence file table?
Or any other better option in Hive?
Thank you
Arun
Hi Vijay,
If the dimension tables are of reasonable size and frequently updated, then
you can deploy *Spark SQL* to get data directly from your MySQL table
through JDBC and do your join with your fact table stored in Hive.
In general these days one can do better with Spark SQL. Your fact table
still
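A sketch of that Spark SQL approach (connection details, table, and column names are all assumptions): expose the MySQL dimension table as a temporary view over JDBC, then join it against the Hive fact table.

```sql
-- Spark SQL sketch; url, credentials, and table names are placeholders.
-- The JDBC view reads the MySQL dimension table on demand.
create temporary view dim_mysql
using org.apache.spark.sql.jdbc
options (
  url 'jdbc:mysql://dbhost:3306/shop',
  dbtable 'dim_customer',
  user 'reader',
  password '...'
);

-- Join the live dimension data with the fact table stored in Hive.
select f.order_id, d.name
from fact_orders f
join dim_mysql d on f.customer_id = d.customer_id;
```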
Dimensions change, and I'd rather do update than recreate a snapshot.
On 23-Sep-2016 17:23, "Markovitz, Dudu" wrote:
> If these are dimension tables, what do you need to update there?
>
> Dudu
>
> *From:* Vijay Ramachandran [mailto:vi...@linkedin.com]
> *Sent:*
If these are dimension tables, what do you need to update there?
Dudu
From: Vijay Ramachandran [mailto:vi...@linkedin.com]
Sent: Friday, September 23, 2016 1:46 PM
To: user@hive.apache.org
Subject: Re: on duplicate update equivalent?
On Fri, Sep 23, 2016 at 3:47 PM, Mich Talebzadeh wrote:
> What is the use case for UPSERT in Hive. The functionality does not exist
> but there are other solutions.
>
> Are we talking about a set of dimension tables with primary keys that need
> to be updated (existing
You may however use a code similar to the following.
The main idea is to work with 2 target tables.
Instead of merging the source table into a target table, we create an
additional target table based on the merge results.
A view is pointing all the time to the most updated target table.
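Since the code itself isn't shown here, the following is only a guess at the shape of that approach (all names invented): build a second target table from a full outer join of the current target and the source, then repoint the view.

```sql
-- Hypothetical sketch of the two-target-table idea; names are invented.
-- tgt_a holds the current data; tgt_b receives the "merged" result.
create table tgt_b as
select coalesce(s.id, t.id)   as id,
       coalesce(s.val, t.val) as val   -- source row wins when both exist
from tgt_a t
full outer join src s on t.id = s.id;

-- Readers always query the view, so the switch is transparent to them.
alter view current_tgt as select * from tgt_b;

drop table tgt_a;  -- reclaim the old target; rebuild it on the next cycle
```

The point of the view is that consumers never need to know which physical table is current.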
Dudu
Hi Vijay,
What is the use case for UPSERT in Hive. The functionality does not exist
but there are other solutions.
Are we talking about a set of dimension tables with primary keys that need
to be updated (existing rows) or inserted (new rows)?
HTH
Dr Mich Talebzadeh
LinkedIn *
We’re not there yet…
https://issues.apache.org/jira/browse/HIVE-10924
Dudu
From: Vijay Ramachandran [mailto:vi...@linkedin.com]
Sent: Friday, September 23, 2016 11:47 AM
To: user@hive.apache.org
Subject: on duplicate update equivalent?
Hello.
Is there a way to write a query with a behaviour
Hello.
Is there a way to write a query with a behaviour equivalent to mysql's "on
duplicate update"? i.e., try to insert, and if key exists, update the row
instead?
thanks,
Yes Sékine, I am talking about AWS ELB logs in the Mumbai region. Let me try
implementing what Andres suggested; I am also on the verge of implementing
another solution as well. I will let you all know once any of the solutions
works.
On Sep 23, 2016 1:11 PM, "Sékine Coulibaly"
Manish,
UTC is not a format (but ISO 8601 is).
Consider UTC as a +00:00 offset at the end of an ISO 8601 time.
E.g.:
2016-01-01T23:45:22.943762+00:00
is strictly equivalent to:
2016-01-01T23:45:22.943762Z
and is also strictly equivalent to the same time expressed in another
timezone such as