Re: ODBC-hiveserver2 question

2018-02-23 Thread Andrew Sears

Add JAR works with HDFS, though perhaps not with ODBC drivers.

ADD JAR hdfs://:8020/hive_jars/hive-contrib-2.1.1.jar

should work (depending on your namenode port; confirm this file exists). Alternative syntax:

ADD JAR hdfs:/hive_jars/hive-contrib-2.1.1.jar

The ODBC driver could be having an issue with the forward slashes. The guaranteed method is to create a permanent association by adding the JAR to hive/lib or hadoop/lib on the hiveserver2 node. Copying it to hive-client/auxlib/ and restarting Hive is an option. Adding the following property to hive-env.sh is another option:

HIVE_AUX_JARS_PATH=

There may be a trace function for your ODBC driver to see a more detailed error. Some ODBC drivers may not support the ADD JAR syntax.

cheers,
Andrew

On February 23, 2018 at 3:27 PM Jörn Franke wrote:

Add jar works only with local files on the Hive server.

On 23. Feb 2018, at 21:08, Andy Srine <andy.sr...@gmail.com> wrote:

Team,

Is ADD JAR from HDFS (ADD JAR hdfs:///hive_jars/hive-contrib-2.1.1.jar;) supported in hiveserver2 via an ODBC connection? Some relevant points:

- I am able to do it in Hive 2.1.1 via JDBC (beeline), but not via an ODBC client.
- In Hive 1.2.1, I can add a jar from the local node, but not a JAR on HDFS.
- Some old blogs online say HiveServer2 doesn't support "ADD JAR" period. But that's not what I experience via beeline.

Let me know your thoughts and experiences.

Thanks,
Andy
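As a sketch of the permanent-registration option described above, the same association can also be made in hive-site.xml via hive.aux.jars.path on the HiveServer2 node; the path below is illustrative, and HiveServer2 must be restarted afterwards.

```xml
<!-- Sketch: register the JAR permanently on the HiveServer2 node
     (path is illustrative); restart HiveServer2 after changing this. -->
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/local/hive/auxlib/hive-contrib-2.1.1.jar</value>
</property>
```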
 


Re: Need help with query

2016-09-22 Thread Andrew Sears

Hi there,

The detailed error should be in the hiveserver2.log


Cheers, Andrew On Wed, Sep 21, 2016 at 3:36 PM, Igor Kravzov < 
igork.ine...@gmail.com [igork.ine...@gmail.com] > wrote:
I run MSCK REPAIR TABLE mytable; and got Error while processing statement: 
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask


On Mon, Sep 12, 2016 at 6:56 PM, Lefty Leverenz < leftylever...@gmail.com 
[leftylever...@gmail.com] > wrote:
Here's a list of the wikidocs about dynamic partitions 
[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DynamicPartitions] 
.


-- Lefty


On Mon, Sep 12, 2016 at 3:25 PM, Devopam Mittra < devo...@gmail.com 
[devo...@gmail.com] > wrote:
Kindly read about dynamic partitions on the cwiki. That will be the perfect 
solution to your requirement, in my opinion.

Regards
Dev
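The dynamic-partition approach suggested above can be sketched as follows; the table, column, and staging-table names are illustrative, and the two SET statements are what dynamic partitioning typically requires.

```sql
-- Load all days in one statement; Hive creates each partition on the fly
-- (my_table, staging_table, col1, col2 are illustrative names).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE my_table PARTITION (mmdd)
SELECT col1, col2, mmdd
FROM staging_table;
```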


On 13 Sep 2016 12:49 am, "Igor Kravzov" < igork.ine...@gmail.com 
[igork.ine...@gmail.com] > wrote:

Hi,
I have a query like this one
alter table my_table add if not exists partition (mmdd=20160912) 
location '/mylocation/20160912';
Is it possible to make it so I don't have to change the date every day? Something 
with CURRENT_DATE?

Thanks in advance.
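Partition values in ALTER TABLE ... ADD PARTITION must be literals, so a common workaround is to build the statement outside Hive. A minimal shell sketch; the table name, location path, and the beeline invocation are illustrative.

```shell
#!/bin/sh
# Build today's ADD PARTITION statement so the date never has to be edited
# by hand (my_table and /mylocation are illustrative names).
dt=$(date +%Y%m%d)
sql="alter table my_table add if not exists partition (mmdd=${dt}) location '/mylocation/${dt}';"
echo "$sql"
# In practice, pass it to Hive, e.g.: beeline -u "$JDBC_URL" -e "$sql"
```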

HiveServer2 thrift service thread pool error

2016-09-13 Thread Andrew Sears

Hi everyone,

We have the Hive 1.2.1.2.3 Thrift service installed with the Atlas plugin and Ranger plugin. After some days we exhaust the running threads, receive an error such as the one below in the logs, and the service stops responding, requiring a restart.

org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Hive Internal Error: java.util.concurrent.RejectedExecutionException(Task java.util.concurrent.FutureTask@1c9f4873 rejected from java.util.concurrent.ThreadPoolExecutor@1bacbbcc[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 345])
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1c9f4873 rejected from java.util.concurrent.ThreadPoolExecutor@1bacbbcc[Running, pool size = 1, active threads = 1, queued tasks = 1, completed tasks = 345]

Does anyone have further information on troubleshooting this issue, or a means to determine what is exhausting the thread pool?

thanks,
Andrew
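If the exhausted pool is HiveServer2's async-execution pool, its size and wait queue are controlled by hive-site.xml properties like the following; the values shown are illustrative, and defaults differ by Hive version.

```xml
<!-- Sketch: the HiveServer2 async-execution pool that typically throws
     RejectedExecutionException when saturated (values are illustrative). -->
<property>
  <name>hive.server2.async.exec.threads</name>
  <value>100</value>
</property>
<property>
  <name>hive.server2.async.exec.wait.queue.size</name>
  <value>100</value>
</property>
```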


Re: Some dates add/less a day...

2016-07-30 Thread Andrew Sears

It is HIVE-13948.

https://github.com/apache/hive/commit/da3ed68eda10533f3c50aae19731ac6d059cda87
https://issues.apache.org/jira/browse/HIVE-13948

Regards,
Andrew

On July 29, 2016 at 6:44 PM Julián Arocena <jaroc...@temperies.com> wrote:

Hey, thank you so much! I was going crazy, you can imagine it :) Please let me know if you have it. I will have a nice weekend with this news.

Best regards,

On 29/7/2016 18:44, "Andrew Sears" <andrew.se...@analyticsdream.com> wrote:

Hi there,

This is a critical bug fixed by a JIRA; I will see if I can get the number for you. It involves patching lib/hive-* files.

Cheers, Andrew

On Fri, Jul 29, 2016 at 4:37 PM, Julián Arocena <jaroc...@temperies.com> wrote:

Hi,

I'm having a problem with some dates using external tables over a text file. Let me give you an example:

file content:
1946-10-01
1946-10-02

table:
create external table date_issue_test (date_test Date)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/hive/test';

Select * from date_issue_test;
OK
1946-10-02
1946-10-02

As you can see, in this case it adds a day; there are a few cases like this. Also I tried with a CAST and a fixed date as below:

hive> select CAST('1946-10-01' as date) from date_issue_test limit 1;
OK
1946-10-02

Any idea to help me?

Thank you so much!
Julian


Re: Some dates add/less a day...

2016-07-29 Thread Andrew Sears
Hi there,

This is a critical bug fixed by a JIRA, will see if I can get the number for 
you. It involves patching lib/hive-* files.

Cheers,
Andrew

On Fri, Jul 29, 2016 at 4:37 PM, Julián Arocena < jaroc...@temperies.com 
[jaroc...@temperies.com] > wrote:
Hi,
I'm having a problem with some dates using external tables over a text file. Let 
me give you an example:

file content:
1946-10-01
1946-10-02

table:
create external table date_issue_test (
date_test Date
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\n' 
STORED AS TEXTFILE LOCATION '/user/hive/test';

Select * from date_issue_test;
OK
1946-10-02
1946-10-02

As you can see, in this case it adds a day; there are a few cases like this.
Also I tried with a CAST and a fixed date as below:

hive> select CAST('1946-10-01' as date) from date_issue_test limit 1;
OK
1946-10-02
Any idea to help me?
Thank you so much!
Julian
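The off-by-one in this thread comes from timezone-sensitive date conversion, the class of bug HIVE-13948 fixes. A standalone illustration of the mechanism (not Hive's actual code), assuming GNU date and the tzdata Etc/GMT+5 zone (a fixed UTC-5 offset): a calendar date converted to a UTC epoch and rendered back in a zone west of UTC lands on the previous day.

```shell
# Illustration only (not Hive's code): date -> UTC epoch -> local rendering
# shifts the calendar day when the local offset is west of UTC.
epoch=$(TZ=UTC date -d '1946-10-01' +%s)
TZ=Etc/GMT+5 date -d "@${epoch}" +%F   # prints 1946-09-30, one day earlier
```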

Re: Hive on TEZ + LLAP

2016-07-15 Thread Andrew Sears
HDP 2.5 includes LLAP.

Cheers,
Andrew

On Fri, Jul 15, 2016 at 11:36 AM, Jörn Franke < jornfra...@gmail.com 
[jornfra...@gmail.com] > wrote:
I would recommend a distribution such as Hortonworks where everything is already 
configured. As far as I know, LLAP is currently not part of any distribution.
On 15 Jul 2016, at 17:04, Ashok Kumar < ashok34...@yahoo.com 
[ashok34...@yahoo.com] > wrote:

Hi,
Has anyone managed to make Hive work with Tez + LLAP as the query engine in 
place of Map-reduce please?
If you configured it yourself, which versions of Tez and LLAP work with Hive 2? 
Do I need to build Tez from source, for example?
Thanks

Re: Best Hive Authorization Model for Shared data

2016-04-12 Thread Andrew Sears

Hi there,

Depending on your distribution you may need to look at tools like Ranger or 
Sentry, which should extend the model to meet your needs.


Regards,
Andrew


On Tue, Apr 12, 2016 at 6:42 PM, Udit Mehta < ume...@groupon.com 
[ume...@groupon.com] > wrote:

Hi all,

I wanted to understand what authorization model is most suitable for a 
production environment where most of the data is shared between multiple 
teams and users.
I know this would depend more on the use case, but I can't seem to figure 
out the best model for our use:


We have data that is owned by a certain process (R/W access for that user) 
while other users only have Read access to that data. We have a lot of 
instances when users would want to create external tables pointing to this 
data. We tried the following 3 auth models:


1. Default Authorization model : This we think is less secure and any user 
can grant himself access to create/modify tables and databases even where 
they are not supposed to. We would want to have much tighter security than 
this model provides.


2. Storage Based Authorization : While this helps us by preventing users 
from modifying metadata by checking the HDFS permissions of the underlying 
directories, it prevents our most important use case of letting users 
create external tables on data they don't have write access to. I would 
assume external tables won't actually delete the data when dropping 
tables/partitions, so this operation should be allowed. But because it is 
not, even this authorization model does not meet our use case.


3. Sql Standard Based Authorization: This does give us fine-grained control 
over which users can perform specific commands, but when it comes to 
creating external tables, even this authorization scheme seems to use the 
filesystem's permissions.


So overall, all 3 models didn't seem to fulfill our requirement here, which I 
think would be a fairly common one. I want to know how other users manage 
security on Hive, or if I am missing something.


Thanks in advance,
Udit
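For reference, a sketch of what the grants in option 3 (SQL-standard-based authorization) look like; the role, user, database, and table names are illustrative.

```sql
-- Sketch under SQL-standard-based authorization; readers, analyst1,
-- shared_db and events are illustrative names.
CREATE ROLE readers;
GRANT SELECT ON TABLE shared_db.events TO ROLE readers;
GRANT ROLE readers TO USER analyst1;
```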

Re: Best way of Unpivoting of Hive table data. Any Analytic function for unpivoting

2016-03-30 Thread Andrew Sears

Select id, 'mycol' as name, col1 as value from mytable
Union all
Select id, 'mycol2' as name, col2 as value from mytable

Something like this might work for you? (UNION ALL avoids the duplicate-elimination pass that plain UNION would trigger.)

Cheers,
Andrew
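An alternative sketch using Hive's built-in stack() UDTF, which unpivots in a single scan instead of one per column; the table and column names are illustrative.

```sql
-- Unpivot col1/col2 into (name, value) rows with one pass over the table
-- (mytable, id, col1, col2 are illustrative names).
SELECT id, t.name, t.value
FROM mytable
LATERAL VIEW stack(2,
  'col1', col1,
  'col2', col2
) t AS name, value;
```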


On Mon, Mar 28, 2016 at 7:53 PM, Ryan Harris < ryan.har...@zionsbancorp.com 
[ryan.har...@zionsbancorp.com] > wrote:
collect_list(col) will give you an array with all of the data from that 
column

However, the scalability of this approach will have limits.

-Original Message-
From: mahender bigdata [mailto:mahender.bigd...@outlook.com]
Sent: Monday, March 28, 2016 5:47 PM
To: user@hive.apache.org
Subject: Best way of Unpivoting of Hive table data. Any Analytic function 
for unpivoting


Hi,

Has anyone implemented unpivoting of Hive external table data? We would
like to convert columns into multiple rows. We have an external table which
holds almost 2 GB of data. Is there a fast way of converting
columns into rows? Are any analytic functions available in Hive to do 
unpivoting?





Re: Automatic Update statistics on ORC tables in Hive

2016-03-28 Thread Andrew Sears
It would be useful to have a script that could be scheduled as part of a low 
priority background job, to update stats at least where none are available, and 
a report in the Hive GUI on stats per table.


Encountered a Tez out-of-memory issue due to the lack of auto-updated stats 
recently.
Cheers, Andrew

On Mon, Mar 28, 2016 at 2:27 PM, Mich Talebzadeh < mich.talebza...@gmail.com 
[mich.talebza...@gmail.com] > wrote:
Hi Alan,
Thanks for the clarification. I gather you are referring to the following notes 
in Jira
"Given the work that's going on in HIVE-11160 
[https://issues.apache.org/jira/browse/HIVE-11160] and HIVE-12763 
[https://issues.apache.org/jira/browse/HIVE-12763] I don't think it makes sense 
to continue down this path. These JIRAs will lay the groundwork for 
auto-gathering stats on data as it is inserted rather than having a background 
process do the work."
I concur; I am not a fan of automatic update statistics, although many RDBMS 
vendors touted it in earlier days. The whole thing turned out to be a 
hindrance, as UPDATE STATISTICS was being fired in the middle of the business 
day, adding to the workload by taking resources away.
Most vendors base the need for updating/gathering stats on the number of rows 
being changed, relying on some function, say datachange(). When the datachange() 
function indicates changes of 10% or more, it is time for update stats to run. Again, 
in my opinion, rather arbitrary and void of any scientific basis. For Hive the 
important one is inserts. For transactional tables one will have updates and 
deletes as well. My understanding is that the classical approach is to report 
how many "row change operations", say inserts, have been performed since the 
last time any kind of analyze statistics was run.

This came to my mind as I was using Spark to load CSV files and create and 
insert into Hive ORC tables. The problem I have is that analyzing statistics 
through Spark fails. This is not a show stopper, as the load shell script 
invokes beeline to log in to Hive and analyze statistics on the newly created 
table. Although some proponents might argue for saving data in Spark as a 
Parquet file, when one has millions and millions of rows then stats matter and 
that is where ORC adds its value.
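The post-load beeline step described above amounts to statements like the following; the table, partition, and value are illustrative, and the exact column-statistics syntax varies somewhat by Hive version.

```sql
-- Sketch of gathering stats after a load (my_orc_table and dt are
-- illustrative names).
ANALYZE TABLE my_orc_table PARTITION (dt='2016-03-28') COMPUTE STATISTICS;
ANALYZE TABLE my_orc_table PARTITION (dt='2016-03-28') COMPUTE STATISTICS FOR COLUMNS;
```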




Cheers


Dr Mich Talebzadeh



LinkedIn 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com




On 28 March 2016 at 18:43, Alan Gates < alanfga...@gmail.com 
[alanfga...@gmail.com] > wrote:
I resolved that as Won’t Fix. See the last comment on the JIRA for my rationale.

Alan.

> On Mar 28, 2016, at 03:53, Mich Talebzadeh < mich.talebza...@gmail.com 
> [mich.talebza...@gmail.com] > wrote:
>
> Thanks. This does not seem to be implemented although the Jira says resolved. 
> It also mentions the timestamp of the last update stats. I do not see it yet.
>
> Regards,
>
> Mich
>
> Dr Mich Talebzadeh
>
> LinkedIn 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
> On 28 March 2016 at 06:19, Gopal Vijayaraghavan < gop...@apache.org 
> [gop...@apache.org] > wrote:
>
> > This might be a bit far fetched but is there any plan for background
> >ANALYZE STATISTICS to be performed on ORC tables
>
>
> https://issues.apache.org/jira/browse/HIVE-12669
>
> Cheers,
> Gopal
>
>
>

Re: read-only mode for hive

2016-03-09 Thread Andrew Sears
Another option might be to lock using a zookeeper script.

Andrew



 On Wed, Mar 09, 2016 at 7:05 PM, Andrew Sears < 
andrew.se...@analyticsdream.com [andrew.se...@analyticsdream.com] > wrote:
What about renaming the table? To another schema with limited rights? Not sure
why just flipping the access grant to select-only wouldn't also work, provided auth
is enabled and not external.

An HDFS snapshot could also give you a point-in-time copy. Set ACLs to restrict
access if enabled.


Cheers,
Andrew


On Wed, Mar 09, 2016 at 1:15 PM, PG User < pguser1...@gmail.com 
[pguser1...@gmail.com] > wrote:
Thank you all for the replies.
My use case is as follows: I want to put a table (or database) in read-only 
mode, then do some operations
such as taking the table definition and an HDFS snapshot. I want to put the table
in read-only mode to maintain consistency. After all my operations are done, I will
again put Hive in read-write mode.
Sentry may not be a solution, as it will not handle existing transactions. 
Creating a view will not solve the purpose either if inserts are going on.
- Nachiket


On Wed, Mar 9, 2016 at 7:20 AM, David Capwell < dcapw...@gmail.com 
[dcapw...@gmail.com] > wrote:
Could always set the table's output format to be the null output format.

On Mar 8, 2016 11:01 PM, "Jörn Franke" < jornfra...@gmail.com 
[jornfra...@gmail.com] > wrote:
What is the use case? You can try security solutions such as Ranger or Sentry.

As already mentioned another alternative could be a view.

> On 08 Mar 2016, at 21:09, PG User < pguser1...@gmail.com 
> [pguser1...@gmail.com] > wrote:
>
> Hi All,
> I have one question about putting hive in read-only mode.
>
> What are the ways of putting hive in read-only mode?
> Can I take a lock at the database level to serve the purpose? What will happen to
existing transactions? My guess is it will not grant a lock until all
transactions are complete.
>
> I read about changing ownership of /user/hive/warehouse/, but it is not a foolproof
solution.
>
> Thank you.
>
> - PG User