Re: HIVE-20420 Provide a fallback authorizer when no other authorizer is in use

2018-11-14 Thread Thejas Nair
This was for CVE-2018-11777.
You can find more details in the description of CVE-2018-11777.

On Wed, Nov 14, 2018 at 3:40 AM Oleksiy S  wrote:
>
> Guys, could you help with this new feature? HIVE-20420
>
> I see no docs, no use cases, just nothing. Thanks.
>
> --
> Oleksiy


Re: [SECURITY] CVE-2018-1314: Hive explain query not being authorized

2018-11-09 Thread Thejas Nair
Terry, yes, this is seen with SQL standard authorization, Ranger, and I
suppose Sentry-based authorization as well.
Hive was not passing the table objects to the authorization plugin
implementations during authorization api calls.

On Wed, Nov 7, 2018 at 1:49 PM Terry  wrote:
>
> Daniel - Is this happening when beeline security is enabled? Can you provide 
> a link for more info on this?
>
> On Wed, Nov 7, 2018 at 14:25 Daniel Dai  wrote:
>>
>> CVE-2018-1314: Hive explain query not being authorized
>>
>> Severity: Important
>>
>> Vendor: The Apache Software Foundation
>>
>> Versions Affected: This vulnerability affects all versions of Hive,
>> including 2.3.3, 3.1.0 and earlier
>>
>> Description: Hive "EXPLAIN" operation does not check for necessary
>> authorization of involved entities in a query. An unauthorized user
>> can do "EXPLAIN" on arbitrary table or view and expose table metadata
>> and statistics.
>>
>> Mitigation: all Hive users shall upgrade to 2.3.4 or 3.1.1 or later
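As a concrete illustration of the flaw (the table name here is hypothetical), a user with no grants on a table could still run an EXPLAIN against it:

```sql
-- Hypothetical table: the user has no SELECT (or any other) privilege on it.
-- Before the fix, this still succeeded and leaked the table's schema,
-- partitioning layout and statistics through the query plan output:
EXPLAIN SELECT * FROM secret_tbl;

-- After upgrading to 2.3.4 / 3.1.1, the authorizer is consulted for the
-- entities referenced by the EXPLAIN, and the statement is rejected.
```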


Re: Proposal: Apply SQL based authorization functions in the metastore.

2018-04-28 Thread Thejas Nair
Hi Elliot,

One scenario where Storage based authorization doesn't work is the
case of object stores such as S3. In those scenarios, the
tool/platform that is accessing the data won't have any restrictions
on data access either. I am not sure how the data access would be
secured in such cases, even if metastore access is controlled.

Overall, the metastore api is a much lower level API, and as a result
it is difficult to enforce higher level restrictions at that level.
(More on that below).

I agree that O/JDBC via HS2 is not something distributed tools can use
(at least with standard API).
I think the ideal way to enforce security is having tools/platforms
read via a 'table server' (and not give them direct file system
access).
At Hortonworks, we have been using this to provide security for Spark,
by allowing it to read in parallel from LLAP daemons -
https://www.slideshare.net/Hadoop_Summit/security-updates-more-seamless-access-controls-with-apache-spark-and-apache-ranger
https://github.com/hortonworks-spark/spark-llap/wiki/1.-Goal-and-features
(You can replace Ranger with SQL auth as well in above examples).

The next phase of that work would likely make use of Apache Arrow for the
data exchange (there are some hive jiras created recently around it).

I had considered having the authorization at metastore level, but
realized that is not the right place to enforce the RDBMS/SQL style
policies. Here are some notes I wrote while back about it -
http://hadoop-pig-hive-thejas.blogspot.com/2014/03/hive-sql-standard-authorization-why-not.html

Quoting from there -
The advantage of doing it at the metastore api level would have been
that pig and MR would also be covered under this authorization model.
But this works only if the SQL actions always need some metastore api
calls, and access control on the calls it needs to make can be used
to enforce the SQL-level authorization.

Take for example the INSERT privilege in SQL: you can grant INSERT without
granting the SELECT privilege. But when processing insert queries for the
user, we need to be able to do a getTable() and read the schema of the
table. Yet from the metastore api perspective, you should not be able
to do a getTable() without having SELECT privileges on the table.
Similar issues happen with DELETE and UPDATE privileges, which you can
grant without SELECT.

Another example is URIs in the SQL statement: you don't need to make
any metastore api calls before accessing URIs, so URI access control can't
be implemented using metastore api calls.

Another use case is anything that you want to allow the ADMIN to do
but the action does not involve specific metastore api calls that can
be used to control the action.
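The INSERT-without-SELECT conflict described above can be sketched as a toy model (simplified Python for illustration, not Hive's actual code; all names are hypothetical):

```python
# Toy model: why SQL-style privileges can't be enforced at the metastore API.
# Grants are per-(user, table) sets of SQL privileges.
grants = {("bob", "t1"): {"INSERT"}}  # bob may INSERT but not SELECT

def has_priv(user, table, priv):
    return priv in grants.get((user, table), set())

def metastore_get_table(user, table):
    # A metastore-level policy would naturally gate metadata reads on SELECT...
    if not has_priv(user, table, "SELECT"):
        raise PermissionError("SELECT required for getTable()")
    return {"name": table, "schema": ["c1 int"]}

def run_insert(user, table):
    # ...but processing an INSERT must read the table schema first, so a
    # perfectly legal INSERT-only user is rejected at the metastore boundary.
    if not has_priv(user, table, "INSERT"):
        raise PermissionError("INSERT required")
    return metastore_get_table(user, table)  # fails despite the valid grant

try:
    run_insert("bob", "t1")
    outcome = "allowed"
except PermissionError:
    outcome = "blocked"
print(outcome)  # prints "blocked": a valid INSERT is wrongly refused
```

The same mismatch is why URI checks and admin-only actions (which involve no specific metastore calls at all) cannot be modeled there either.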

Thanks,
Thejas


On Fri, Apr 20, 2018 at 6:30 AM, Elliot West  wrote:
> Hello,
>
> I’d like to propose that SQL based authorization (or something similar) be
> applied and enforced also in the metastore service as part of the initiative
> to extract HMS as an independent project. While any such implementation
> cannot be ’system complete’ like HiveServer2 (HS2) (HMS has no scope to
> intercept operations applied to table data, only metadata), it would be a
> significant step forward for controlling the operations that can be actioned
> by the many non-HS2 clients in the Hive ecosystem.
>
> I believe this is a good time to consider this option as there is currently
> much discussion in the Hive community on the future directions of HMS and
> greater recognition that HMS is now seen as general data platform
> infrastructure and not simply an internal Hive component.
>
> Further details are below. I’d be grateful for any feedback, thoughts, and
> suggestions on how this could move forward.
>
> Problem
> At this time, Hive’s SQL based authorization feature is the recommended
> approach for controlling which operations may be performed on what by whom.
> This feature is applied in the HS2 component. However, a large number of
> platforms that integrate with Hive do not do so via HS2, instead talking to
> the metastore service directly and so bypassing authorization. They can
> perform destructive operations such as a table drop even though the
> permissions declared in the metastore may explicitly forbid it as they are
> able to circumvent the authorization logic in HS2.
>
> In short, there seems to be a lack of encapsulation with authorization in
> the metastore; HMS owns the metadata, is responsible for performing actions
> on metadata, for maintaining permissions on what actions are permissible by
> whom, and yet has no means to use the information it has to protect the data
> it owns.
>
> Workarounds
> Common workarounds to this deficiency include falling back to storage based
> authorization or running read only metastore instances. However, both of
> these approaches have significant drawbacks:
>
> File based auth does not function when using object stores such as S3 and so
> is not usable in cloud deployments of Hive - a 

Re: CVE-2016-3083: Apache Hive SSL vulnerability bug disclosure

2017-05-30 Thread Thejas Nair
It went in under the guise of the jira - HIVE-13390.
Commit -
https://github.com/apache/hive/commit/3b2ea248078bdf3a8372958cf51a989dc3883bcc

On Tue, May 30, 2017 at 12:35 PM, Ying Chen  wrote:

> Hello -
> Was there a particular JIRA(s) that went into Hive 1.2.2 that fixed this
> issue?
> Thanks much.
> Ying
>
>
> On Wed, May 24, 2017 at 3:56 PM, Vaibhav Gumashta <
> vgumas...@hortonworks.com> wrote:
>
>> Severity: Important
>>
>> Vendor: The Apache Software Foundation
>>
>> Versions Affected:
>> Apache Hive 0.13.x
>> Apache Hive 0.14.x
>> Apache Hive 1.0.0 - 1.0.1
>> Apache Hive 1.1.0 - 1.1.1
>> Apache Hive 1.2.0 - 1.2.1
>> Apache Hive 2.0.0
>>
>> Description:
>>
>> Apache Hive (JDBC + HiveServer2) implements SSL for plain TCP and HTTP
>> connections (it supports both transport modes). While validating the
>> server’s certificate during the connection setup, the client doesn’t seem
>> to be verifying the common name attribute of the certificate. In this way,
>> if a JDBC client sends an SSL request to server abc.com, and the server
>> responds with a valid certificate (certified by CA) but issued to xyz.com,
>> the client will accept that as a valid certificate and the SSL handshake
>> will go through.
>>
>> Mitigation:
>>
>> Upgrade to Apache Hive 1.2.2 for 1.x release line, or to Apache Hive
>> 2.0.1 or later for 2.0.x release line, or to Apache Hive 2.1.0 and later
>> for 2.1.x release line.
>>
>> Credit: This issue was discovered by Branden Crawford from Inteco Systems
>> Limited (inetco.com).
>>
>
>

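The missing check can be seen by contrast with Python's ssl module defaults (a sketch for illustration only; Hive's JDBC client is Java, and this is not Hive code):

```python
import ssl

# A correct TLS client verifies two separate things: the CA chain AND that
# the certificate's common name / SAN matches the host it connected to.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: CA chain is verified
print(ctx.check_hostname)                    # True: hostname is verified

# The vulnerable Hive client behaved as if only the hostname check were off:
# a valid CA-signed certificate issued to xyz.com was accepted for abc.com.
vulnerable = ssl.create_default_context()
vulnerable.check_hostname = False
print(vulnerable.check_hostname)             # False
```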

Re: SQL Standard Based Hive Authorization with CDH 5.X

2017-05-11 Thread Thejas Nair
You can also set them via hiveserver2-site.xml instead of passing them as
commandline params.
Let me make that clearer in the doc.
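For reference, a hiveserver2-site.xml equivalent of the --hiveconf flags discussed in this thread might look like the following (property names are taken from the thread; treat this as a sketch, not a tested configuration):

```xml
<configuration>
  <property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
  </property>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
  </property>
</configuration>
```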

On Thu, May 11, 2017 at 9:36 AM, Rob Anderson 
wrote:

> You add the options to HiveServer2 Environment Advanced Configuration
> Snippet (Safety Valve) via:
>
> HIVE_OPTS=--hiveconf hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory --hiveconf hive.security.authorization.enabled=true --hiveconf hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator --hiveconf hive.metastore.uris='thrift://XX:9083'
>
> Works fine.
>
> Rob
>
> On Tue, May 9, 2017 at 3:59 PM, Rob Anderson 
> wrote:
>
>> Has anyone implemented SQL Standard Based Hive Authorization with CDH
>> 5.5.2 (hive1.1.0)?
>>
>> Cloudera has confirmed that it's not supported, but I have a need that
>> requires the implementation.
>>
>> I've followed: 
>> https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
>>
>> I've added the following to "HiveServer2 Advanced Configuration Snippet
>> (Safety Valve) for hive-site.xml" via Cloudera Manager.
>>
>> <property>
>>   <name>hive.server2.enable.doAs</name>
>>   <value>false</value>
>> </property>
>>
>> <property>
>>   <name>hive.users.in.admin.role</name>
>>   <value>oozie_runtime,hive,randerson</value>
>> </property>
>>
>> <property>
>>   <name>hive.security.metastore.authorization.manager</name>
>>   <value>org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly</value>
>> </property>
>>
>> <property>
>>   <name>hive.security.authorization.manager</name>
>>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory</value>
>> </property>
>>
>> <property>
>>   <name>hive.security.authorization.task.factory</name>
>>   <value>org.apache.hadoop.hive.ql.parse.authorization.HiveAuthorizationTaskFactoryImpl</value>
>> </property>
>>
>>
>> I've tried adding the following start up options to "HiveServer2
>> Environment Advanced Configuration Snippet (Safety Valve)" via Cloudera
>> Manager.
>>
>>- -hiveconf hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
>>- -hiveconf hive.security.authorization.enabled=true
>>- -hiveconf hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
>>- -hiveconf hive.metastore.uris=' '
>>
>>
>> I get the following error:
>>
>> Could not parse: HiveServer2 Environment Advanced Configuration Snippet
>> (Safety Valve) : Could not parse parameter 'hive_hs2_env_safety_valve'.
>> Was expecting: valid variable name. Input: -hiveconf
>> hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
>> -hiveconf hive.security.authorization.enabled=true -hiveconf
>> hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
>> -hiveconf hive.metastore.uris=' '
>>
>> So, in short - I'm not sure how to start hiveserver2 with those options.
>> Any help you can offer is appreciated.
>>
>> Thanks,
>>
>> Rob
>>
>


Re: why need set hive.server2.enable.doAs=false in SQL-Standard Based Authorization

2016-08-19 Thread Thejas Nair
1 - If it is set to true, you need to manage permissions in two places for
users: using grant/revoke on tables, and file system permissions as well,
and keep them in sync. That will be a headache.
Moreover, the main intent of sql std auth is to be able to provide fine-grained
access control using views (access to only certain columns/rows).
To allow users to change file system permissions, you would need to allow them
access to the file system, which means you can't do fine-grained access control.

2. The principal specified in the connect string indicates what the
service principal is; it is not the principal of the user who is
connecting. You can kinit as any user.
The doAs setting does not affect authentication.

3. The grant/revoke not having any privilege requirements was an issue in
the old default legacy auth. It is not an issue in SQL std auth.
hive.users.in.admin.role is used to set the list of admin users.


4. You can use SQL auth with storage based if you have certain users who
access metastore without going through HS2, for example hive cli users.
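As a sketch of the view-based, fine-grained pattern mentioned in point 1 (the table, column and role names here are hypothetical):

```sql
-- Expose only non-sensitive columns of a table through a view:
CREATE VIEW customer_public AS
SELECT name, city FROM customer;

-- Grant access to the view, not the underlying table or its files.
-- Users never need (and never get) file system access to the raw data:
GRANT SELECT ON TABLE customer_public TO ROLE analysts;
```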



On Thu, Aug 18, 2016 at 12:41 AM, Maria  wrote:

>
> Hi,all:
>   I have a few questions about hive authentication and authorization:
>
> (1)why do we need to set hive.server2.enable.doAs=false in SQL-Standard
> Based Authorization ?
>
> (2) When hive.server2.enable.doAs=false is set in SQL-Standard Based
> Authorization and beeline is used to connect to HS2,
> the queries are run as the service user id of HiveServer2; how can we make it
> use the user who is in the current kerberos ticket cache?
> (because if "hive.server2.enable.doAs=false" and the hive uri is like
> this——"jdbc:hive2://cdh1:1/default;principal=hive/c...@javachen.com",
> the kerberos ticket cache will not work.)
>
> (3) Do hive 1.2.1 and later versions still have the grant/revoke BUG? I found
> someone said that a user needs to implement AbstractSemanticAnalyzerHook to
> apply administrator privileges, if he wants only the administrator to own the
> grant/revoke privilege. But I also found a parameter
> "hive.users.in.admin.role"; does this param make up for this deficiency?
>
> (4) Must I start up the hive metastore service when using SQL Standards Based
> Hive Authorization in conjunction with storage based authorization? (
> https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
> ), and if the two are combined, must "hive.server2.enable.doAs" be set to false?
>
> (5) Can someone please give me a tip on this class:
> BitSetCheckAuthorizationProvider? Can I set
> "hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.BitSetCheckAuthorizationProvider"?
> What are the differences between BitSetCheckAuthorizationProvider and
> SQLStdHiveAuthorizerFactory?
>
>
> I am confused by these questions for a long time. I am eager to get your
> guidance.
>
> Any reply will be much appreciated.
> And thankyou again.
>
>
>
>


Re: javax.jdo.JDODataStoreException: Required table missing : "VERSION" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrec

2016-08-19 Thread Thejas Nair
As the error message indicates, please use the "schematool -initSchema
-dbType .." command to create the proper metadata schema in the metastore.
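For example (assuming a MySQL-backed metastore; substitute your own dbType, e.g. postgres, oracle, or derby):

```shell
# Create the metastore schema for the installed Hive version:
schematool -dbType mysql -initSchema

# Verify the schema version afterwards:
schematool -dbType mysql -info
```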


On Fri, Aug 19, 2016 at 4:53 AM, شجاع الرحمن بیگ 
wrote:

> Hey,
>
> Could you please help me resolving this error?
> version= hive 2.1.0
>
> ERROR StatusLogger No log4j2 configuration file found. Using default
> configuration: logging only errors to the console.
> 13:41:37.848 [main] ERROR hive.ql.metadata.Hive - Cannot initialize metastore due to autoCreate error
> javax.jdo.JDODataStoreException: Required table missing : "VERSION" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.schema.autoCreateTables"
> at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:553) ~[datanucleus-api-jdo-4.2.1.jar:?]
> at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:720) ~[datanucleus-api-jdo-4.2.1.jar:?]
> at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740) ~[datanucleus-api-jdo-4.2.1.jar:?]
> at org.apache.hadoop.hive.metastore.ObjectStore.setMetaStoreSchemaVersion(ObjectStore.java:7763) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7657) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7632) ~[hive-exec-2.1.0.jar:2.1.0]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_79]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_79]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_79]
> at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_79]
> at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) ~[hive-exec-2.1.0.jar:2.1.0]
> at com.sun.proxy.$Proxy11.verifySchema(Unknown Source) ~[?:?]
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:547) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:612) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:398) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6396) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70) ~[hive-exec-2.1.0.jar:2.1.0]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.7.0_79]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) ~[?:1.7.0_79]
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.7.0_79]
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[?:1.7.0_79]
> at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1625) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:80) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:130) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3317) ~[hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3356) [hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336) [hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590) [hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:236) [hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:221) [hive-exec-2.1.0.jar:2.1.0]
> at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:366) [hive-exec-2.1.0.jar:2.1.0]
> at 

Re: [ANNOUNCE] Apache Hive 2.1.0 Released

2016-06-21 Thread Thejas Nair
Thanks for your hard work and patience in driving the release Jesus! :)


On Tue, Jun 21, 2016 at 10:18 AM, Jesus Camachorodriguez
 wrote:
> The Apache Hive team is proud to announce the release of Apache Hive
> version 2.1.0.
>
> The Apache Hive (TM) data warehouse software facilitates querying and
> managing large datasets residing in distributed storage. Built on top
> of Apache Hadoop (TM), it provides, among others:
>
> * Tools to enable easy data extract/transform/load (ETL)
>
> * A mechanism to impose structure on a variety of data formats
>
> * Access to files stored either directly in Apache HDFS (TM) or in other
>   data storage systems such as Apache HBase (TM)
>
> * Query execution via Apache Hadoop MapReduce and Apache Tez frameworks.
>
> For Hive release details and downloads, please visit:
> https://hive.apache.org/downloads.html
>
> Hive 2.1.0 Release Notes are available here:
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12334255
>
> We would like to thank the many contributors who made this release
> possible.
>
> Regards,
>
> The Apache Hive Team
>
>
>


Re: Hive 2 database Entity-Relationship Diagram

2016-05-19 Thread Thejas Nair
The pdf is not upside down for me, in the Chrome browser.


However, there seem to be many tables that are not related to (or rather used 
by) hive, specifically the DMRV_* and DMRS_* ones.


Thanks,

Thejas



From: Mich Talebzadeh 
Sent: Thursday, May 19, 2016 10:53 AM
To: user
Cc: user @spark
Subject: Re: Hive 2 database Entity-Relationship Diagram

thanks Dudu for your comments

I will check.

I will realign the overlapping tables

Only some tables have relationships, not all, I am afraid. Most DMRS_% tables 
are standalone.

I can see the PDF as it is. Can you kindly check the top left-hand corner, the 
one below?





Do you see this upside down?

Thanks


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 19 May 2016 at 18:44, Markovitz, Dudu 
> wrote:
Thanks Mich

I'm afraid the current format is not completely user friendly.
I would suggest to divide the tables to multiple sets by subjects / graph 
connectivity (BTW, it seems odd that most of the tables are disconnected)

Also -

* HIVEUSER.PARTITION_KEY_VALS is partially covering another table

* The PDF is upside-down

Dudu

From: Mich Talebzadeh 
[mailto:mich.talebza...@gmail.com]
Sent: Thursday, May 19, 2016 8:04 PM
To: user >; user @spark 
>
Subject: Re: Hive 2 database Entity-Relationship Diagram

Attachment





On 19 May 2016 at 18:02, Mich Talebzadeh 
> wrote:
Hi All,

I use Hive 2 with metastore created for Oracle Database with 
hive-txn-schema-2.0.0.oracle.sql.

It already includes concurrency stuff added into metastore

The RDBMS is Oracle Database 12c Enterprise Edition Release 12.1.0.2.0.

 I created an Entity-Relationship (ER) diagram from the physical model. There 
are 194 tables, 127 views and 38 relationships. The relationship notation is 
Bachman

Fairly big diagram in PDF format. However, you can zoom into it.


Please have a look; I'd appreciate comments, and if it is useful we can 
load it into the wiki.


HTH








Re: use jdbc connect to hive2.0

2016-05-04 Thread Thejas Nair
The properties should be named hadoop.proxyuser.dcos.hosts and
hadoop.proxyuser.dcos.groups if you are running HS2 as user dcos.
Also, user dcos should belong to the group dcos if that setting is to
work. I assume you are trying to have hive.server2.enable.doAs=true.

I am not sure why you are seeing different behavior with hive 1.2.1 .
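Assuming HS2 runs as user dcos, the core-site.xml entries would look roughly like the following (the hosts value is a placeholder; this is a sketch, not a tested configuration):

```xml
<property>
  <name>hadoop.proxyuser.dcos.groups</name>
  <value>dcos</value>
</property>
<property>
  <name>hadoop.proxyuser.dcos.hosts</name>
  <value>localhost</value>
</property>
```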

On Wed, May 4, 2016 at 12:27 AM, kevin  wrote:
> I found hadoop info is :2016-05-04 13:49:33,785 INFO
> org.apache.hadoop.ipc.Server: Socket Reader #1 for port 9000:
> readAndProcess from client 10.1.3.120 threw exception
> [org.apache.hadoop.security.authorize.AuthorizationException: User: dcos is
> not allowed to impersonate dcos]
>
> and when I turn to hive1.2.1 the problem is gone. Does hive2 not match
> hadoop2.7.1?
>
> 2016-05-04 11:29 GMT+08:00 kevin :
>
>> hi all;
>> I use hive-2.0.0 with hadoop2.7.1 when I connect to the hiveserver2 I got
>> msg : Failed to open new session: java.lang.RuntimeException:
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
>> User: dcos is not allowed to impersonate dcos
>>
>>  dcos is the user I use to start hiveserver2.
>>
>> I also tried to configure hadoop core-site.xml with the following, but it did not help:
>>
>> <property>
>>   <name>hadoop.proxyuser.hadoop.groups</name>
>>   <value>dcos</value>
>>   <description>Allow the superuser oozie to impersonate any members of
>> the group group1 and group2</description>
>> </property>
>>
>> <property>
>>   <name>hadoop.proxyuser.hadoop.hosts</name>
>>   <value>localhost</value>
>>   <description>The superuser can connect only from host1 and host2 to
>> impersonate a user</description>
>> </property>
>>
>>
>>


Re: [VOTE] Bylaws change to allow some commits without review

2016-04-18 Thread Thejas Nair
+1


From: Wei Zheng 
Sent: Monday, April 18, 2016 10:51 AM
To: user@hive.apache.org
Subject: Re: [VOTE] Bylaws change to allow some commits without review

+1

Thanks,
Wei

From: Siddharth Seth >
Reply-To: "user@hive.apache.org" 
>
Date: Monday, April 18, 2016 at 10:29
To: "user@hive.apache.org" 
>
Subject: Re: [VOTE] Bylaws change to allow some commits without review

+1

On Wed, Apr 13, 2016 at 3:58 PM, Lars Francke 
> wrote:
Hi everyone,

we had a discussion on the dev@ list about allowing some forms of contributions 
to be committed without a review.

The exact sentence I propose to add is: "Minor issues (e.g. typos, code style 
issues, JavaDoc changes. At committer's discretion) can be committed after 
soliciting feedback/review on the mailing list and not receiving feedback 
within 2 days."

The proposed bylaws can also be seen here 


This vote requires a 2/3 majority of all Active PMC members so I'd love to get 
as many votes as possible. The vote will run for at least six days.

Thanks,
Lars



Re: Issue with Star schema

2016-03-15 Thread Thejas Nair
As suggested, looking at the explain plan should tell you if map-join
is getting used.
Using a recent version with hive-on-tez would also give you further
speedup as map-joins are optimized further in it.


On Tue, Mar 15, 2016 at 9:32 AM, sreebalineni .  wrote:
> You can think of map joins. If the cluster is configured with defaults, it must
> be happening already; check the query profile.
>
> On Tue, 15 Mar 2016 21:12 Himabindu sanka, 
> wrote:
>
>> Hi Team,
>>
>>
>>
>> I have a query where I am joining with 10 other entities
>>
>>
>>
>> Like
>>
>>
>>
>> Select  a.col1,b1.col1,b2.col1 from
>>
>>
>>
>> A a
>>
>> Left outer join b1 on
>>
>> Left outer join b2 on
>>
>> Left outer join b3….
>>
>>
>>
>>
>>
>> A is my driver entity, which has 20 million records. Most of the other entities
>> are small, just 10 to 20 rows of data.
>>
>>
>> In that scenario, my hive query is taking hours to join and fetch the
>> results. Please suggest an optimization technique.
>>
>> This is chocking the query performance.
>>
>>
>>
>>
>>
>> *Regards,*
>>
>> *Himabindu Sanka*
>>


Re: show databases doesn't return all databases with Kerberos/Sentry enabled

2015-05-13 Thread Thejas Nair
Hi Liping,
As Szehon said, the sentry mailing list is likely to be able to help
you with this.

Please note that the secur...@hive.apache.org is meant to be used to
report security vulnerabilities, it is not the right place for
questions on features .

Thanks,
Thejas


On Wed, May 13, 2015 at 12:05 PM, Szehon Ho sze...@cloudera.com wrote:
 Hi Liping

 Do you want to check the Sentry mailing list as well?  They might know more
 about this scenario.

 Thanks
 Szehon

 On Wed, May 13, 2015 at 10:18 AM, Liping Zhang zlpmiche...@gmail.com
 wrote:

 Dear all,

 I installed CDH, kerberos, sentry to enable security for hive beeline. I
 did following commands:

 # kinit -k -t hive.keytab
 hive/ip-172-31-9-84.us-west-2.compute.inter...@hadoop.com

 # beeline -u
 jdbc:hive2://ip-172-31-9-84.us-west-2.compute.internal:1/default;principal=hive/ip-172-31-9-84.us-west-2.compute.inte...@hadoop.com


 and in beeline CLI:
 # show databases
 # show tables
 these 2 show commands did work well and return all the databases and
 tables we had.

 However, after I changed some configuration and restart hive,  and rerun
 above commands with hive user, it was  strange that in beeline CLI, show
 databases and show tables didn't return all the databases and tables we
 had, instead, they only returned one default database, and no tables
 returned.

 The configuration change I remembered included:
 Adding:

 <property>
   <name>hive.server2.authentication</name>
   <value>KERBEROS</value>
 </property>
 To Hive Client Configuration Safety Valve for hive-site.xml (Gateway
 service in CM) properly modified the hive-site.xml for Hue Server.


 The commands in beeline I remembered after restarting hive, and before
 running show databases and show tables was:
 # create role role1;
 # show roles;
 # show current roles;

 after that, the hive user in beeline could only return default
 database with no tables for show databases and show tables command.


 And I checked /user/hive/warehouse dir, all the databases and tables files
 did existed.


 Did anyone met this kind of issue before? Any comments and discussion are
 highly appreciated!


 --
 Cheers,
 -
 Big Data - Big Wisdom - Big Value
 --
 Michelle Zhang (Liping Zhang)




Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Thejas Nair
Congrats!

On Wed, Jan 28, 2015 at 2:08 PM, Prasanth Jayachandran
pjayachand...@hortonworks.com wrote:
 Thanks Szehon and everyone! Congratulations Vikram, Jason, Szehon and Owen!

 On Jan 28, 2015, at 1:51 PM, Szehon Ho 
 sze...@cloudera.commailto:sze...@cloudera.com wrote:

 Thanks and congrats to Vikram, Jason, Owen, and Prasanth !

 On Wed, Jan 28, 2015 at 1:28 PM, Hari Subramaniyan 
 hsubramani...@hortonworks.commailto:hsubramani...@hortonworks.com wrote:

 Congrats everyone!


 Thanks

 Hari

 
 From: cwsteinb...@gmail.commailto:cwsteinb...@gmail.com 
 cwsteinb...@gmail.commailto:cwsteinb...@gmail.com on behalf of Carl 
 Steinbach c...@apache.orgmailto:c...@apache.org
 Sent: Wednesday, January 28, 2015 1:15 PM
 To: d...@hive.apache.orgmailto:d...@hive.apache.org; 
 user@hive.apache.orgmailto:user@hive.apache.org
 Cc: sze...@apache.orgmailto:sze...@apache.org; 
 vik...@apache.orgmailto:vik...@apache.org; 
 jd...@apache.orgmailto:jd...@apache.org; Owen O'Malley; 
 prasan...@apache.orgmailto:prasan...@apache.org
 Subject: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason 
 Dere, Owen O'Malley and Prasanth Jayachandran

 I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen 
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project 
 Management Committee. Please join me in congratulating the these new PMC 
 members!

 Thanks.

 - Carl




Re: [ANNOUNCE] New Hive PMC Member - Prasad Mujumdar

2014-12-09 Thread Thejas Nair
Congrats Prasad!


On Tue, Dec 9, 2014 at 5:03 PM, Navis류승우 navis@nexr.com wrote:
 Congratulations!

 2014-12-10 8:35 GMT+09:00 Jason Dere jd...@hortonworks.com:

 Congrats!

 On Dec 9, 2014, at 3:02 PM, Venkat V venka...@gmail.com wrote:

  Congrats Prasad!
 
  On Tue, Dec 9, 2014 at 2:32 PM, Brock Noland br...@cloudera.com wrote:
  Congratulations Prasad!!
 
  On Tue, Dec 9, 2014 at 2:17 PM, Carl Steinbach c...@apache.org wrote:
  I am pleased to announce that Prasad Mujumdar has been elected to the
  Hive Project Management Committee. Please join me in congratulating Prasad!
 
  Thanks.
 
  - Carl
 
 
 
 
  --
  Venkat V


 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity
 to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified
 that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender
 immediately
 and delete it from your system. Thank You.





Re: [ANNOUNCE] Apache Hive 0.14.0 Released

2014-11-17 Thread Thejas Nair
The link to the download page is now  -
https://hive.apache.org/downloads.html
(I have also corrected the email template in how-to-release wiki with new
url).


On Mon, Nov 17, 2014 at 1:59 PM, Roshan Naik ros...@hortonworks.com wrote:

 1)  fyi.. this link is broken:

 http://hive.apache.org/releases.html

 2) Java docs were not published for 0.14.0

 https://hive.apache.org/javadoc.html



 On Sun, Nov 16, 2014 at 7:04 PM, Clark Yang (杨卓荦) yangzhuo...@gmail.com
 wrote:

  Great job! Congrats!
 
  Thanks,
  Zhuoluo (Clark) Yang
 
  2014-11-13 8:55 GMT+08:00 Gunther Hagleitner gunt...@apache.org:
 
   The Apache Hive team is proud to announce the release of Apache
   Hive version 0.14.0.
  
   The Apache Hive (TM) data warehouse software facilitates querying and
   managing large datasets residing in distributed storage. Built on top
   of Apache Hadoop (TM), it provides:
  
   * Tools to enable easy data extract/transform/load (ETL).
  
   * A mechanism to impose structure on a variety of data formats.
  
   * Access to files stored either directly in Apache HDFS (TM) or in
 other
  data storage systems such as Apache HBase (TM) or Apache Accumulo (TM).
  
   * Query execution via Apache Hadoop MapReduce and Apache Tez
 frameworks.
  
   * Cost-based query planning via Apache Calcite
  
  
   For Hive release details and downloads, please visit:
  http://hive.apache.org/releases.html
  
   Hive 0.14.0 Release Notes are available here:
 
  https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12326450&styleName=Text&projectId=12310843
  
   We would like to thank the many contributors who made this release
   possible.
  
   Regards,
  
   The Apache Hive Team
  
  
 





Re: [ANNOUNCE] New Hive PMC Member - Alan Gates

2014-10-28 Thread Thejas Nair
Congrats Alan!


On Mon, Oct 27, 2014 at 10:11 PM, Sergio Pena sergio.p...@cloudera.com wrote:
 Congratulations Alan!!

 On Mon, Oct 27, 2014 at 5:38 PM, Carl Steinbach c...@apache.org wrote:

 I am pleased to announce that Alan Gates has been elected to the Hive
 Project Management Committee. Please join me in congratulating Alan!

 Thanks.

 - Carl




Re: hcatalog table permissions error

2014-10-28 Thread Thejas Nair
Nathan,
Can you check if you have hive.metastore.execute.setugi=true in conf
object used by HCatClient.create(conf) ?

Thanks,
Thejas
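
For reference, a sketch of how the property Thejas mentions might look in hive-site.xml (it can equally be set directly on the Configuration object passed to HCatClient.create); whether it resolves the ownership issue depends on how the metastore is deployed:

```xml
<!-- hive-site.xml: propagate the client's user/group to the metastore,
     so directories the metastore creates are owned by the calling user
     rather than the user the metastore server runs as -->
<property>
  <name>hive.metastore.execute.setugi</name>
  <value>true</value>
</property>
```

Note this takes effect only on unsecured (non-Kerberos) metastore deployments, and both client and server should agree on the setting.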

On Wed, Oct 22, 2014 at 12:55 PM, Nathan Bamford
nathan.bamf...@redpoint.net wrote:
 Whoops, I pasted the wrong line in for the hcat table records. Should have
 been:


 -rw-r--r--   3 nbamford hive110 2014-10-22 07:52
 /user/hive/warehouse/atest_hcat/part-m-355114470

 
 From: Nathan Bamford nathan.bamf...@redpoint.net
 Sent: Wednesday, October 22, 2014 12:50 PM
 To: user@hive.apache.org
 Subject: hcatalog table permissions error


 Hello,

   I've been puzzling away at a permissions issue I get through the hcatalog
 interface for a while now, and I think perhaps I've found a bug.

   When I create a table via the hive cli as user nbamford the directory
 created in hdfs has the owner I expect, nbamford:


 drwxrwxrwt   - nbamford hive  0 2014-10-22 07:18
 /user/hive/warehouse/atest_cli


   When I do the same thing in java using the HCatClient interface (note I am
 the os user nbamford in this case, just as when I run the hive cli):


 HCatClient client = HCatClient.create(conf);
 List<HCatFieldSchema> schema = new ArrayList<HCatFieldSchema>();
 schema.add(new HCatFieldSchema("id",
 org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory.intTypeInfo, ""));
 schema.add(new HCatFieldSchema("value",
 org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory.stringTypeInfo, ""));
 HCatCreateTableDesc.Builder desc = HCatCreateTableDesc.create("default",
 "atest_hcat", schema);
 client.createTable(desc.build());

 I find the directory is owned by user hive:

 drwxrwxrwt   - hive hive  0 2014-10-22 07:52
 /user/hive/warehouse/atest_hcat

 However, if I write records to the table, via the HCatWriter interface, they
 are owned by nbamford:

 drwxrwxrwt   - hive hive  0 2014-10-22 07:52
 /user/hive/warehouse/atest_hcat

 Is this a bug in the hcatalog api, or is there a way to set the user name I
 have missed?







Re: hive 0.13.0 guava issues

2014-09-30 Thread Thejas Nair
Regarding the rejects by the mailing list, try sending emails as plain
text (not html), and see if that helps.

Please reply to that thread on the hadoop mailing list about guava 11,
more feedback there will help.

I am not sure why guava was moved into the hive-exec fat jar in 0.13.
Feel free to open a jira about that.



On Tue, Sep 30, 2014 at 11:19 AM, Viral Bajaria viral.baja...@gmail.com wrote:
 (Take 3... for some reason my reply emails are getting rejected by apache
 mailing list)

 2 things here:

 I looked at the discussion, and the concern there is more about breaking
 user code that assumes that guava 11.0 will be available via Hadoop v/s
 anything breaking in Hadoop. I think that's a little flawed argument and
 everyone has been hacking around to use the latest guava by doing CLASSPATH
 hacks.

 The bigger issue is hive-exec in 0.13 packaging guava 11 as a fat-jar v/s
 having it as a library dependency that can be easily removed if needed.

 Previously I had a hack to let the ClassLoader load the guava version that
 my user code cares about the most, i.e. the latest one.

 Never seen Hadoop break because of that.

 Thanks,
 Viral



 On Mon, Sep 29, 2014 at 11:45 AM, Thejas Nair the...@hortonworks.com
 wrote:

  guava jar is there as part of hadoop, and hadoop uses the guava 11.0 jar.
 As the guava versions are not fully compatible with each other, hive
 can't upgrade unless hadoop also upgrades or we use a good way to
 isolate the jar usage.
 See discussion in hadoop about upgrading the guava version -

 http://search-hadoop.com/m/LgpTk2MpkYf1/guava+stevesubj=Time+to+address+the+Guava+version+problem

 On Fri, Sep 26, 2014 at 4:00 PM, Viral Bajaria viral.baja...@gmail.com
 wrote:
  Hi,
 
  We just upgraded from hive 0.11.0 to 0.13.0 (finally!!)
 
  So I noticed that for hive-exec jar, guava is packaged in the jar v/s
  previously it wasn't.
 
  Any reason it is package now ?
 
  Secondly, is there anything that is stopping to bump the guava version
  from
  11.0 to the latest one ?
 
  Given that 11.0 was released a few years ago, shouldn't we update that ?
 
  Happy to create the JIRA and make that update if there is consensus on
  that.
 
  Thanks,
  Viral
 






Re: hive 0.13.0 guava issues

2014-09-29 Thread Thejas Nair
guava jar is there as part of hadoop, and hadoop uses the guava 11.0 jar.
As the guava versions are not fully compatible with each other, hive
can't upgrade unless hadoop also upgrades or we use a good way to
isolate the jar usage.
See discussion in hadoop about upgrading the guava version -
http://search-hadoop.com/m/LgpTk2MpkYf1/guava+stevesubj=Time+to+address+the+Guava+version+problem

On Fri, Sep 26, 2014 at 4:00 PM, Viral Bajaria viral.baja...@gmail.com wrote:
 Hi,

 We just upgraded from hive 0.11.0 to 0.13.0 (finally!!)

 So I noticed that for hive-exec jar, guava is packaged in the jar v/s
 previously it wasn't.

 Any reason it is package now ?

 Secondly, is there anything that is stopping to bump the guava version from
 11.0 to the latest one ?

 Given that 11.0 was released a few years ago, shouldn't we update that ?

 Happy to create the JIRA and make that update if there is consensus on that.

 Thanks,
 Viral




Re: Getting Started with Hive 0.13.1 ClassNotFound

2014-09-02 Thread Thejas Nair
hcat_server is essentially the same as the hive metastore; it is there for
historical reasons (time to remove it, in my opinion). As the metastore
command is more commonly used, the hcat_server command might have bugs
that have gone unnoticed because of its disuse.


You can start the hive metastore by using 'hive --service metastore'


On Tue, Sep 2, 2014 at 11:21 AM, Geoffry Roberts threadedb...@gmail.com wrote:
 I'm sure this is a classpath issue but where?

 I am getting started with hive 0.13.1.  Hadoop 2.4.0 is up and running.  I
 installed hive from a tarball and followed the directions. But when it came
 time to start:

 $HIVE_HOME/hcatalog/sbin/hcat_server.sh start

 I get a CNFE: java.lang.ClassNotFoundException:
 org.apache.hadoop.hive.metastore.HiveMetaStore

 No doubt we have a class path issue here but how to fix?  The class in
 question is in $HIVE_HOME/lib/hive-metastore-0.13.1.jar

 I have HIVE_HOME and HIVE_CONF_DIR set up.

 --
 There are ways and there are ways,

 Geoffry Roberts



Re: Questions about hive authorization under hdfs permissions.

2014-06-16 Thread Thejas Nair
I hope you don't mind me cc'ing the user group so that this Q&A is
available for others as well.

The grant/revoke based authorization models (including the new
sql-standards based authorization in hive 0.13) do not automatically
ensure that the user has the necessary privileges on hdfs dirs and files.
To have this model work with hdfs, the usual strategy is to have all
users go through hiveserver2. HiveServer2 is configured with
hive.server2.doAs=false, and then you give permissions on hdfs to the
user hiveserver2 is running as.
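
As a sketch, the setup described above amounts to something like the following in hive-site.xml on the HiveServer2 host (the full property name for the doAs setting is assumed from Hive 0.13-era configuration; verify against your version):

```xml
<!-- hive-site.xml on the HiveServer2 host -->
<property>
  <!-- run queries as the hiveserver2 service user, not the end user -->
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
<property>
  <!-- turn on authorization checks -->
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
```

With doAs disabled, only the service user needs hdfs permissions on the warehouse directories, and the grant/revoke model controls what end users can do through HiveServer2.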




On Sun, Jun 15, 2014 at 8:27 PM, Apple Wang apple.wang...@gmail.com wrote:
 Hi, Thejas

 I'm a user of Hive and I'm confused with Hive authorization under hdfs
 permissions. I know you are an expert on it. Could you please help me with
 the following problems?

 I have enabled hive authorization in my testing cluster(Hive 0.12). I use
 the user hive to create database hivedb and grant create privilege on hivedb
 to user root.

 But I have come across the following problem: root cannot create a table in
 hivedb even though it has the create privilege.

 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception:
 org.apache.hadoop.security.AccessControlException Permission denied:
 user=root, access=WRITE,
 inode=/tmp/user/hive/warehouse/hivedb.db:hive:hadoop:drwxr-xr-x
 at
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
 at
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:214)
 at
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:158)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5499)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5481)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5455)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3455)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3425)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3397)
 at
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:724)
 at
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
 at
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48089)
 at
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)


 It is obvious that the hivedb.db directory in HDFS is not writable by
 other users. So how does hive authorization work under HDFS
 permissions?

 PS. If I create a table as user hive and grant the update privilege to user
 root, the same ERROR occurs if I load data into the table as root.

 Looking forward to your reply!

 Thanks



CVE-2014-0228: Apache Hive Authorization vulnerability

2014-06-12 Thread Thejas Nair
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

CVE-2014-0228: Apache Hive Authorization vulnerability

Severity: Moderate

Vendor: The Apache Software Foundation

Versions affected: Apache Hive 0.13.0

Users affected: Users who have enabled SQL standards based authorization mode.

Description:
In SQL standards based authorization mode, the URIs used in Hive
queries are expected to be authorized on the file system permissions.
However, the directory used in import/export statements is not being
authorized. This allows a user who knows the directory to which data
has been exported to import that data into his table. This is possible
if the user HiveServer2 runs as has permissions for that directory and
its contents.

Mitigation: Users who use SQL standards based authorization should
upgrade to 0.13.1.

Credit: This issue was discovered by Thejas Nair of Hortonworks.
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.20 (Darwin)
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJTmiJUAAoJENkN9OKO5uMpHmMQAJvyHJetKGdznknT9491liQu
6M0EXQq0dVXWFc5nOzCu9CvuBZgBDeCkxKHM8M/4373clyoxOVGeehxrj0VB4aY8
BPcRDcwY+m16HF1j8W4xSiSFWRtFwedgY7seez9lHihBS0tJmsZ3xYV3mIzgUKVf
MkwimimgraQ/Z9Hh5pMuC0IEhk2K8gcGMEOZwYR2VeCI8ycpkAE8Ykx7zABL9Cpa
fS5elrGwL1kQ2fCUu+c4UJG8MmNjxWiVohtnmz5VQR7FkJUMirSK4onta7stH7Lx
NhibY9ENPmRMwpR0UbEfNOxIm4qvIZL38qNb+DqYZ5s+idoNifdW5MBp0DTxy8NI
t9diPNnSqoyZ1wsQckta76NodHKUlcxBKEIgdtSFG0qKKc8tcUTCcW8hfUTvrov/
D29w98Ap2FTHX7O6iAxl+G8JGy01n2j3m3QwQeSYqUwcub7HRb2Dneb92V/1VX5C
/z8BEnn1IohEYWSUKDyPNwG41/+oM5BUBGr9uPSA79+kvYeaaL2cVn7Csi3H3U2x
fDrQEvBhiptGjX0aS9WWhoeuCUF+PROTN7izFKDtnXJYhd3KqWFj6ccgP3aybVlk
iGoekwy5Pp44z9FZzMCibX19qi8ZbAU97lujZXvw9Bn2U+NchXbVEKjlDStlhoom
ieaMv2ISHo/5eUqh5kDj
=ZFSB
-END PGP SIGNATURE-



Re: Metastore 0.13 is not starting up.

2014-05-30 Thread Thejas Nair
Have you tried using schematool to upgrade the metastore schema ?
https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool
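
A sketch of the schematool invocation for this kind of 0.12 to 0.13 upgrade (assuming a MySQL-backed metastore, as in the report below; the connection details are read from hive-site.xml):

```shell
# show the schema version currently recorded in the metastore
schematool -dbType mysql -info

# upgrade the metastore schema from 0.12.0 to the current version
schematool -dbType mysql -upgradeSchemaFrom 0.12.0
```

Running the upgrade scripts avoids relying on datanucleus autoCreate to build missing tables such as SEQUENCE_TABLE at startup.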

On Wed, May 21, 2014 at 4:29 AM, Biswajit Nayak
biswajit.na...@inmobi.com wrote:
 Hi All,

 I had metastore(0.12) running previously. But after upgrading to 0.13 it is
 failing with below error message. The upgrade was a clean new setup.

 Additional details:-
 Mysql Version:- 5.6
 Mysql Connector :- 5.1.30


 Starting Hive Metastore Server

 log4j:WARN No such property [maxBackupIndex] in
 org.apache.log4j.DailyRollingFileAppender.

 javax.jdo.JDOUserException: Could not create increment/table
 value-generation container `SEQUENCE_TABLE` since autoCreate flags do not
 allow it.

 at
 org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:549)

 at
 org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)

 at
 org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)

 at
 org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:458)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

 at java.lang.reflect.Method.invoke(Method.java:597)

 at
 org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)

 at $Proxy4.createDatabase(Unknown Source)

 at
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:509)

 at
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:524)

 at
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:398)

 at
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:357)

 at
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.init(RetryingHMSHandler.java:54)

 at
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)

 at
 org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4967)

 at
 org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:5187)

 at
 org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5107)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

 at java.lang.reflect.Method.invoke(Method.java:597)

 at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

 NestedThrowablesStackTrace:

 Could not create increment/table value-generation container
 `SEQUENCE_TABLE` since autoCreate flags do not allow it.

 org.datanucleus.exceptions.NucleusUserException: Could not create
 increment/table value-generation container `SEQUENCE_TABLE` since
 autoCreate flags do not allow it.

 at
 org.datanucleus.store.rdbms.valuegenerator.TableGenerator.createRepository(TableGenerator.java:261)

 at
 org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obtainGenerationBlock(AbstractRDBMSGenerator.java:162)

 at
 org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:197)

 at
 org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:105)

 at
 org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGenerator(RDBMSStoreManager.java:2005)

 at
 org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1386)

 at
 org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3827)

 at
 org.datanucleus.state.JDOStateManager.setIdentity(JDOStateManager.java:2571)

 at
 org.datanucleus.state.JDOStateManager.initialiseForPersistentNew(JDOStateManager.java:513)

 at
 org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:232)

 at
 org.datanucleus.ExecutionContextImpl.newObjectProviderForPersistentNew(ExecutionContextImpl.java:1414)

 at
 org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2218)

 at
 org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:2065)

 at
 org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1913)

 at
 org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)

 at
 org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727)

 at
 org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)

 at
 org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:458)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 

Re: OrcOutputFormat

2014-05-15 Thread Thejas Nair
I am not familiar with https://github.com/mayanhui/hive-orc-mr/. But
is there any reason why you are not using hcatalog input/output format
for this ?  
https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput.

On Wed, Apr 30, 2014 at 4:25 AM, Seema Datar sda...@yahoo-inc.com wrote:
 Hi All,

 Does anybody have ideas to solve this issue?

 Thanks,
 Seema

 From: Seema Datar sda...@yahoo-inc.com
 Date: Tuesday, April 29, 2014 at 11:10 PM

 To: user@hive.apache.org user@hive.apache.org
 Subject: Re: OrcOutputFormat

 Hi Abhishek,

 I was referring to the link below and was trying to do something similar.

 https://github.com/mayanhui/hive-orc-mr/

 This package does not seem to use Hcatalog.

 Thanks,
 Seema


 From: Abhishek Girish agir...@ncsu.edu
 Reply-To: user@hive.apache.org user@hive.apache.org
 Date: Tuesday, April 29, 2014 at 10:38 PM
 To: user@hive.apache.org user@hive.apache.org
 Subject: Re: OrcOutputFormat

 Hi,

 AFAIK, you would need to use HCatalog APIs to read-from/write-to an ORCFile.
 Please refer to
 https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput

 -Abhishek



 On Tue, Apr 29, 2014 at 6:40 AM, Seema Datar sda...@yahoo-inc.com wrote:

 Hi,

 I am trying to run an MR job to write files in ORC format.  I do not see
 any files created although the job runs successfully. If I change the output
 format from OrcOutputFormat to TextOutputFormat (and that being the only
 change), I see the output files getting created. I am using Hive-0.12.0. I
 tried upgrading to Hive 0.13.0 but with this version I get the following
 error -

 2014-04-29 10:37:07,426 FATAL [main] org.apache.hadoop.mapred.YarnChild:
 Error running child : java.lang.VerifyError:
 org/apache/hadoop/hive/ql/io/orc/OrcProto$RowIndex
  at
 org.apache.hadoop.hive.ql.io.orc.WriterImpl.init(WriterImpl.java:129)
  at
 org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:369)
  at
 org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:104)
  at
 org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:91)
  at
 org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:784)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:411)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)

 How do you think can this issue be resolved?


 Thanks,

 Seema





Re: MapReduce with HCatalog hangs

2014-05-02 Thread Thejas Nair
HcatInputFormat does not run any initial mapreduce jobs. It seems to
me that the MapReduce job actually ran.
You might want to take a jstack of your java client program, to see
what it is waiting on.


On Fri, May 2, 2014 at 7:28 AM, Fabian Reinartz
fab.ek...@googlemail.com wrote:
 I implemented a MapReduce job with HCatalog as input and output. It's pretty
 much the same as the example on the website.

 If I start my job with `hadoop jar` an initial MapReduce is performed
 (which, I guess is the query for the HCatalog data as the setup method in my
 mapper is not executed). After that MapReduce no further output happens (for
 hours, so pretty sure it hangs).

 The output of that initial MapReduce contains:

 Map-Reduce Framework
 Map input records=23700
 Map output records=0


 So apparently all records of my data are read (but not passed on after
 that?).
 Any ideas what the problem could be?

 The input schema for the job is correct, the records are initially read but
 my mapper is never executed.

 I'm using Hadoop 2.4 and Hive 0.13.



Re: Hive Vs Pig: Master's thesis

2014-05-02 Thread Thejas Nair
The primary difference between hive and pig is the language. There are
implementation differences that will result in performance
differences, but it will be hard to figure out which aspect of the
implementation is responsible for which improvement.

I think a more interesting project would be to compare the impact of
various performance improvements in hive. There are many features that
you can turn on and off.

example -
- hive vectorization
- file format - text vs RCFile vs ORC
- compressed vs uncompressed
- mapreduce vs tez execution engine
- stats optimized queries
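
Most of these can be toggled per session, which makes A/B measurements straightforward. A sketch using Hive 0.13-era setting names (file format and compression are instead chosen at table-creation time):

```sql
-- toggle vectorized execution for this session
set hive.vectorized.execution.enabled=true;

-- switch the execution engine between mr and tez
set hive.execution.engine=tez;

-- answer eligible queries (e.g. count(*)) from stored statistics
set hive.compute.query.using.stats=true;

-- file format is picked per table, e.g.:
-- CREATE TABLE t (...) STORED AS ORC;   -- vs TEXTFILE / RCFILE
```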



On Thu, May 1, 2014 at 5:47 AM, Sarfraz Ramay sarfraz.ra...@gmail.com wrote:

 Hi,

 It seems that both Hive and Pig are used for managing large data sets.
 Hive is more SQL oriented whereas Pig is more for the data flows. I am doing
 a master's thesis on the performance evaluation of both. Can some please
 provide a list of tasks that would make for an interesting comparison ?


 What is Hive good at ?

 What is Pig good at ?

 Ideally, i would like to take what Hive is good at and test it in Pig and
 vice versa. The competitive characteristics  would make for an interesting
 comparison.




 Regards,
 Sarfraz Rasheed Ramay (DIT)
 Dublin, Ireland.





Re: What is the minimal required version of Hadoop for Hive 0.13.0?

2014-04-23 Thread Thejas Nair
There is a jira for the hadoop 1.0.x compatibility issue.
https://issues.apache.org/jira/browse/HIVE-6962
I have suggested a possible workaround there. There is no patch for it yet.

I am planning to propose a 0.13.1 release, primarily for the issues
around use of Oracle as metastore database, and one in SQL standard
authorization (HIVE-6945, HIVE-6919).
We can also include a patch to get Hive working with older versions of
Hadoop, especially 1.x versions.

Hive build and tests are currently run against hadoop 1.2.1 and 2.3.0
versions (as you can see in pom.xml). But I don't believe there was a
conscious decision to have 1.2.1 as the *minimum* required version.


On Wed, Apr 23, 2014 at 7:20 AM, David Gayou david.ga...@kxen.com wrote:
 I actually have pretty the same issue with Hadoop 1.1.2

 There is a jira issue opened here :
 https://issues.apache.org/jira/browse/HIVE-6962
 with a link to the issue that created our problem.

 A quick search in the release notes seems to indicate that the unset method
 appeared in Hadoop 1.2.1.

 Is it now the minimal required version ?
 If not, will there be a Hive 0.13.1 for older hadoop?

 Regards,

 David


 On Wed, Apr 23, 2014 at 4:00 PM, Dmitry Vasilenko dvasi...@gmail.com
 wrote:


 Hive 0.12.0 (and previous versions) worked with Hadoop 0.20.x, 0.23.x.y,
 1.x.y, 2.x.y.

 Hive 0.13.0 did not work with Hadoop 0.20.x out of the box and to make it
 work I had to patch Hadoop installation and add Path(URL) constructor and
 Configuration.unset() method.

 After that the basic functionality seems to be working.

 Both issues originate from the org.apache.hadoop.hive.ql.exec.Utilities

 I know that Hadoop 0.20.x is old but some of us still have to work with
 that version. So does anyone know what is the minimal required version of
 Hadoop for Hive 0.13.0?

 Thanks
 Dmitry Vasilenko





Re: [ANNOUNCE] Apache Hive 0.13.0 Released

2014-04-21 Thread Thejas Nair
Thanks to Harish for all the hard work managing and getting the release out!

This is great news! This is a significant release in hive! This has
more than twice the number of jiras included (see release note link),
compared to 0.12, and earlier releases which were also out after a
similar gap of 5-6 months. It shows tremendous growth in hive
community activity!

hive 0.13 - 1081
hive 0.12 - 439
hive 0.11 - 374

-Thejas

On Mon, Apr 21, 2014 at 3:17 PM, Harish Butani rhbut...@apache.org wrote:
 The Apache Hive team is proud to announce the the release of Apache
 Hive version 0.13.0.

 The Apache Hive (TM) data warehouse software facilitates querying and
 managing large datasets residing in distributed storage. Built on top
 of Apache Hadoop (TM), it provides:

 * Tools to enable easy data extract/transform/load (ETL)

 * A mechanism to impose structure on a variety of data formats

 * Access to files stored either directly in Apache HDFS (TM) or in other
   data storage systems such as Apache HBase (TM)

 * Query execution via MapReduce

 For Hive release details and downloads, please visit:
 http://www.apache.org/dyn/closer.cgi/hive/

 Hive 0.13.0 Release Notes are available here:
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324312styleName=TextprojectId=12310843

 We would like to thank the many contributors who made this release
 possible.

 Regards,

 The Apache Hive Team

 PS: we are having technical difficulty updating the website. Will resolve
 this shortly.



Re: [ANNOUNCE] New Hive Committers - Alan Gates, Daniel Dai, and Sushanth Sowmyan

2014-04-14 Thread Thejas Nair
Congrats Alan, Daniel, Sushanth!


On Mon, Apr 14, 2014 at 11:29 AM, Szehon Ho sze...@cloudera.com wrote:
 Congratulations, guys!


 On Mon, Apr 14, 2014 at 11:14 AM, Hari Subramaniyan
 hsubramani...@hortonworks.com wrote:

 Congrats!


 On Mon, Apr 14, 2014 at 11:09 AM, Tao Li litao.bupt...@gmail.com wrote:

 Congrats!

 On Apr 15, 2014 1:55 AM, Prasanth Jayachandran
 pjayachand...@hortonworks.com wrote:

 Congratulations everyone!!

 Thanks
 Prasanth Jayachandran

 On Apr 14, 2014, at 10:51 AM, Carl Steinbach c...@apache.org wrote:

  The Apache Hive PMC has voted to make Alan Gates, Daniel Dai, and
  Sushanth
  Sowmyan committers on the Apache Hive Project.
 
  Please join me in congratulating Alan, Daniel, and Sushanth!
 
  - Carl










Re: Hive jdbc access to HiveServer2. How to debug?

2014-04-11 Thread Thejas Nair
This hanging issue has been seen when there is a mismatch between the
auth modes of the HS2 server and the client (usually SASL vs. NOSASL).
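
For example (a hedged sketch: `auth=noSasl` is the standard Hive JDBC URL
parameter for this, and the host/port are placeholders), if HiveServer2 is
running with SASL disabled, the client must say so explicitly:

```
# hive-site.xml on the server:
#   hive.server2.authentication = NOSASL
# then the JDBC client must also skip SASL, e.g. with beeline:
beeline -u "jdbc:hive2://localhost:10000/default;auth=noSasl"
```

If the two sides disagree (one side speaking SASL, the other not), the
connection typically hangs right after the negotiation messages, as in the
log below.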


On Sun, Apr 6, 2014 at 5:40 AM, Jay Vyas jayunit...@gmail.com wrote:
 Hi hive.

 I can't run JDBC queries against HiveServer2.  It appears the client is
 connecting to the thrift service, but after the negotiation loop is
 complete, no more messages occur, and the call to getConnection() in my
 program just hangs.

 Any thoughts on how to get more info about what is hanging after the last
 call in the logs of TSaslTransport below?

 Connect:  jdbc:hive2://localhost:1/default... getting connection
 timeout=0

 0[main] DEBUG org.apache.thrift.transport.TSaslTransport  - opening
 transport org.apache.thrift.transport.TSaslClientTransport@219ba640
 0 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - opening
 transport org.apache.thrift.transport.TSaslClientTransport@219ba640
 3[main] DEBUG org.apache.thrift.transport.TSaslClientTransport  -
 Sending mechanism name PLAIN and initial response of length 14
 3 [main] DEBUG org.apache.thrift.transport.TSaslClientTransport  - Sending
 mechanism name PLAIN and initial response of length 14
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT:
 Writing message with status START and payload length 5
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Writing
 message with status START and payload length 5
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT:
 Writing message with status COMPLETE and payload length 14
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Writing
 message with status COMPLETE and payload length 14
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT:
 Start message handled
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Start
 message handled
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Main
 negotiation loop complete
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Main
 negotiation loop complete
 6[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: SASL
 Client receiving last message
 6 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: SASL
 Client receiving last message

 (now what)? at this point, the code just hangs.

 --
 Jay Vyas
 http://jayunit100.blogspot.com



Re: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang

2014-02-28 Thread Thejas Nair
Congrats Xuefu!



On Fri, Feb 28, 2014 at 3:20 AM, Clark Yang (杨卓荦) yangzhuo...@gmail.com wrote:

 Congrats, Xuefu!

 Thanks,
 Zhuoluo (Clark) Yang


 2014-02-28 19:19 GMT+08:00 Lianhui Wang lianhuiwan...@gmail.com:

 Congrats Xuefu!


 2014-02-28 17:49 GMT+08:00 Jason Dere jd...@hortonworks.com:

  Congrats Xuefu!
 
 
  On Feb 28, 2014, at 1:43 AM, Biswajit Nayak biswajit.na...@inmobi.com
  wrote:
 
   Congrats Xuefu..
  
   With Best Regards
   Biswajit
  
   ~Biswa
   -oThe important thing is not to stop questioning o-
  
  
   On Fri, Feb 28, 2014 at 2:50 PM, Carl Steinbach c...@apache.org
 wrote:
   I am pleased to announce that Xuefu Zhang has been elected to the Hive
  Project Management Committee. Please join me in congratulating Xuefu!
  
   Thanks.
  
   Carl
  
  
  
   _
   The information contained in this communication is intended solely for
  the use of the individual or entity to whom it is addressed and others
  authorized to receive it. It may contain confidential or legally
 privileged
  information. If you are not the intended recipient you are hereby
 notified
  that any disclosure, copying, distribution or taking any action in
 reliance
  on the contents of this information is strictly prohibited and may be
  unlawful. If you have received this communication in error, please
 notify
  us immediately by responding to this email and then delete it from your
  system. The firm is neither liable for the proper and complete
 transmission
  of the information contained in this communication nor for any delay in
 its
  receipt.
 
 
 



 --
 thanks

 王联辉(Lianhui Wang)
 blog; http://blog.csdn.net/lance_123
 兴趣方向:数据库,分布式,数据挖掘,编程语言,互联网技术等






Re: [ANNOUNCE] New Hive Committer - Remus Rusanu

2014-02-26 Thread Thejas Nair
Congrats Remus!



On Wed, Feb 26, 2014 at 9:08 AM, Clay McDonald 
stuart.mcdon...@bateswhite.com wrote:

 Congratulations Remus!

 Clay

 -Original Message-
 From: Jarek Jarcec Cecho [mailto:jar...@apache.org]
 Sent: Wednesday, February 26, 2014 12:06 PM
 To: user@hive.apache.org
 Cc: d...@hive.apache.org; rem...@microsoft.com
 Subject: Re: [ANNOUNCE] New Hive Committer - Remus Rusanu

 Congratulations Remus, good work!

 Jarcec

 On Wed, Feb 26, 2014 at 08:58:43AM -0800, Carl Steinbach wrote:
  The Apache Hive PMC has voted to make Remus Rusanu a committer on the
  Apache Hive Project.
 
  Please join me in congratulating Remus!
 
  Thanks.
 
  Carl




Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Thejas Nair
Congrats Jason and Sergey!
Well deserved!
Looking forward to your help in getting the 'patch available' count
down (it's at 225 now)!


On Mon, Jan 27, 2014 at 10:55 AM, Vaibhav Gumashta
vgumas...@hortonworks.com wrote:
 Congrats Sergey and Jason!

 --Vaibhav


 On Mon, Jan 27, 2014 at 10:47 AM, Vikram Dixit vik...@hortonworks.com
 wrote:

 Congrats Sergey and Jason!

 Thanks
 Vikram.

 On Jan 27, 2014, at 8:36 AM, Carl Steinbach wrote:

  The Apache Hive PMC has voted to make Sergey Shelukhin and Jason Dere
  committers on the Apache Hive Project.
 
  Please join me in congratulating Sergey and Jason!
 
  Thanks.
 
  Carl








Re: HIVE+MAPREDUCE

2014-01-27 Thread Thejas Nair
You can use hcatalog to write into hive tables from mapreduce.
See https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput .
Another example is in https://gist.github.com/thejasmn/7607406


On Tue, Jan 21, 2014 at 12:21 AM, Ranjini Rathinam
ranjinibe...@gmail.com wrote:
 Hi,

 Need to load the data into hive table using mapreduce, using java.

 Please suggest the code related to hive +mapreduce.



 Thanks in advance

 Ranjini R





Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2014-01-15 Thread Thejas Nair
Adding another sentence to clarify that with a -1, the patch can be
reverted: "If the code has been committed before the -1, the code can
be reverted until the vote is over."

Approval :  Code Change : The code can be committed after the first
+1. Committers should wait for reasonable time after patch is
available so that other committers have had a chance to look at it. If
a -1 is received and an agreement is not reached among the committers
on how to resolve the issue, lazy majority with a voting period of 7
days will be used. If the code has been committed before the -1, the
code can be reverted until the vote is over.


Carl,
People seem to agree (and other people seem to be OK, considering the
silence). Can you please include this in the by-law changes being
proposed and put it to vote ?

Thanks,
Thejas




On Tue, Jan 14, 2014 at 11:05 PM, Lefty Leverenz
leftylever...@gmail.com wrote:
 This wording seems fine.  You could add a here:  Committers should wait
 for [a] reasonable time

 The guidance is good.

 +1

 -- Lefty


 On Tue, Jan 14, 2014 at 7:53 PM, Thejas Nair the...@hortonworks.com wrote:

 I guess the silence from others on changing the '24 hours from +1'
 to a guidance of '24 hours from patch available' implies they are OK
 with this change.

 Proposed general guidance for commits for committers: Wait for 24
 hours from the time a patch is made 'patch available' before doing a
 +1 and committing, so that other committers have had sufficient time
 to look at the patch. If the patch is a trivial and safe change, such
 as a small bug fix, an improvement in an error message, or an
 incremental documentation change, it is OK not to wait for 24 hours.
 For significant changes the wait should be a couple of days. If the
 patch is updated and the new patch is significantly different from
 the old one, the wait should start from the time the new patch is
 uploaded. Use your discretion to decide whether it would be useful to
 wait longer than 24 hours over a weekend or holiday.

 Proposed change in by-law : (if someone can word it better, that would
 be great!)

 Action : Code Change : A change made to a codebase of the project and
 committed by a committer. This includes source code, documentation,
 website content, etc.

 Approval :  Code Change : The code can be committed after the first
 +1. Committers should wait for reasonable time after patch is
 available so that other committers have had a chance to look at it. If
 a -1 is received and an agreement is not reached among the committers
 on how to resolve the issue, lazy majority with a voting period of 7
 days will be used.

 Minimum Length : Code Change : 7 days on a -1.


 On Tue, Jan 14, 2014 at 6:25 PM, Vikram Dixit vik...@hortonworks.com
 wrote:
  I think there is value in having some changes committed in less than 24
  hours. Particularly for minor changes. Also reverting of patches makes
  sense. Although it could be cumbersome, it is not much worse than what
  would happen now in case of a bad commit. Anyway we wait for the unit
 tests
  to complete at the very least.
 
  I am +1 on Thejas' proposal.
 
 
  On Tue, Jan 7, 2014 at 7:01 PM, Thejas Nair the...@hortonworks.com
 wrote:
 
  After thinking some more about it, I am not sure if we need to have a
  hard and fast rule of 24 hours before commit. I think we should let
  committers make a call on if this is a trivial, safe and non
  controversial change and commit it in less than 24 hours in such
  cases. In case of larger changes, waiting for couple of days for
  feedback makes sense.
  If a committer feels that a patch shouldn't have gone in (because of
  technical issues or it went in too soon), they should be able to -1 it
  and revert the patch, until further review is done.
 
  In other words, I think this can be a guidance instead of a law in the
  by-laws. What do others in hive community think about this ?
 
  This has been working well in case of other apache hadoop related
 projects.
 
 
  On Fri, Dec 27, 2013 at 2:28 PM, Sergey Shelukhin
  ser...@hortonworks.com wrote:
   I actually have a patch out on a jira that says it will be committed
 in
  24
   hours from long ago ;)
  
    Is the 24h rule needed at all? In other projects, I've seen patches
 simply
   reverted by author (or someone else). It's a rare occurrence, and it
  should
   be possible to revert a patch if someone -1s it after commit, esp.
 within
   the same 24 hours when not many other changes are in.
  
  
   On Fri, Dec 27, 2013 at 1:03 PM, Thejas Nair the...@hortonworks.com
  wrote:
  
   I agree with Ashutosh that the 24 hour waiting period after +1 is
   cumbersome, I have also forgotten to commit patches after +1,
   resulting in patches going stale.
  
   But I think 24 hours wait between creation of jira and patch commit
 is
   not very useful, as the thing to be examined is the patch and not the
   jira summary/description.
   I think having a waiting period of 24 hours between a jira being made

Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2014-01-08 Thread Thejas Nair
More thoughts on the 24 hour wait: Changing the by-law to a 24 hr
wait from the first time a patch is marked as available (or making
this a guidance instead of a by-law) is likely to nudge committers to
review patches sooner. Right now, the clock starts ticking for a
commit when another committer has +1'd. With the change, the clock
starts ticking when the patch is available (i.e., controlled by the
contributor). I think this 'small' change will improve things in a
bigger way.



On Tue, Jan 7, 2014 at 7:01 PM, Thejas Nair the...@hortonworks.com wrote:
 After thinking some more about it, I am not sure if we need to have a
 hard and fast rule of 24 hours before commit. I think we should let
 committers make a call on if this is a trivial, safe and non
 controversial change and commit it in less than 24 hours in such
 cases. In case of larger changes, waiting for couple of days for
 feedback makes sense.
 If a committer feels that a patch shouldn't have gone in (because of
 technical issues or it went in too soon), they should be able to -1 it
 and revert the patch, until further review is done.

 In other words, I think this can be a guidance instead of a law in the
 by-laws. What do others in hive community think about this ?

 This has been working well in case of other apache hadoop related projects.


 On Fri, Dec 27, 2013 at 2:28 PM, Sergey Shelukhin
 ser...@hortonworks.com wrote:
 I actually have a patch out on a jira that says it will be committed in 24
 hours from long ago ;)

 Is the 24h rule needed at all? In other projects, I've seen patches simply
 reverted by author (or someone else). It's a rare occurrence, and it should
 be possible to revert a patch if someone -1s it after commit, esp. within
 the same 24 hours when not many other changes are in.


 On Fri, Dec 27, 2013 at 1:03 PM, Thejas Nair the...@hortonworks.com wrote:

 I agree with Ashutosh that the 24 hour waiting period after +1 is
 cumbersome, I have also forgotten to commit patches after +1,
 resulting in patches going stale.

 But I think 24 hours wait between creation of jira and patch commit is
 not very useful, as the thing to be examined is the patch and not the
 jira summary/description.
 I think having a waiting period of 24 hours between a jira being made
 'patch available' and committing is better and sufficient.


 On Fri, Dec 27, 2013 at 11:44 AM, Ashutosh Chauhan hashut...@apache.org
 wrote:
  Proposed changes look good to me, both suggested by Carl and Thejas.
  Another one I would like to add for consideration is: 24 hour rule
  between
  +1 and commit. Since this exists only in Hive (no other Apache project
  I am aware of has it), this surprises new contributors. More importantly,
  I
  have seen multiple cases where patch didn't get committed because
  committer
  after +1 forgot to commit after 24 hours have passed. I propose to
  modify
  that one such that there must be 24 hour duration between creation of
  jira
  and patch commit, that will ensure that there is sufficient time for
  folks
  to see changes which are happening on trunk.
 
  Thanks,
  Ashutosh
 
 
  On Fri, Dec 27, 2013 at 9:33 AM, Thejas Nair the...@hortonworks.com
  wrote:
 
  The changes look good to me.
  Only concern I have is with the 7 days for release candidate voting.
  Based on my experience with releases, it often takes few cycles to get
  the candidate out, and people tend to vote closer to the end of the
  voting period. This can mean that it takes several weeks to get a
  release out. But this will not be so much of a problem as long as
  people don't wait for end of the voting period to vote, or if they
  look at the candidate branch even before the release candidate is out.
 
  Should we also include a provision for branch merges ? I think we
  should have a longer voting period for branch merges (3 days instead
  of 1?) and require 3 +1s (this part is also in the hadoop by-law ) .
 
 
  On Thu, Dec 26, 2013 at 7:08 PM, Carl Steinbach c...@apache.org wrote:
   I think we should make several changes to the Apache Hive Project
   Bylaws.
   The proposed changes are available for review here:
  
  
 
  https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38568856
  
   Most of the changes were directly inspired by provisions found in the
  Apache
   Hadoop Project Bylaws.
  
   Summary of proposed changes:
  
   * Add provisions for branch committers and speculative branches.
  
   * Define the responsibilities of a release manager.
  
   * PMC Chairs serve for one year and are elected by the PMC using
   Single
   Transferable Vote (STV) voting.
  
   * With the exception of code change votes, the minimum length of all
  voting
   periods is extended to seven days.
  
   Thanks.
  
   Carl
 

Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2014-01-07 Thread Thejas Nair
After thinking some more about it, I am not sure if we need to have a
hard and fast rule of 24 hours before commit. I think we should let
committers make a call on if this is a trivial, safe and non
controversial change and commit it in less than 24 hours in such
cases. In case of larger changes, waiting for couple of days for
feedback makes sense.
If a committer feels that a patch shouldn't have gone in (because of
technical issues or it went in too soon), they should be able to -1 it
and revert the patch, until further review is done.

In other words, I think this can be a guidance instead of a law in the
by-laws. What do others in hive community think about this ?

This has been working well in case of other apache hadoop related projects.


On Fri, Dec 27, 2013 at 2:28 PM, Sergey Shelukhin
ser...@hortonworks.com wrote:
 I actually have a patch out on a jira that says it will be committed in 24
 hours from long ago ;)

 Is the 24h rule needed at all? In other projects, I've seen patches simply
 reverted by author (or someone else). It's a rare occurrence, and it should
 be possible to revert a patch if someone -1s it after commit, esp. within
 the same 24 hours when not many other changes are in.


 On Fri, Dec 27, 2013 at 1:03 PM, Thejas Nair the...@hortonworks.com wrote:

 I agree with Ashutosh that the 24 hour waiting period after +1 is
 cumbersome, I have also forgotten to commit patches after +1,
 resulting in patches going stale.

 But I think 24 hours wait between creation of jira and patch commit is
 not very useful, as the thing to be examined is the patch and not the
 jira summary/description.
 I think having a waiting period of 24 hours between a jira being made
 'patch available' and committing is better and sufficient.


 On Fri, Dec 27, 2013 at 11:44 AM, Ashutosh Chauhan hashut...@apache.org
 wrote:
  Proposed changes look good to me, both suggested by Carl and Thejas.
  Another one I would like to add for consideration is: 24 hour rule
  between
  +1 and commit. Since this exists only in Hive (no other Apache project
  I am aware of has it), this surprises new contributors. More importantly,
  I
  have seen multiple cases where patch didn't get committed because
  committer
  after +1 forgot to commit after 24 hours have passed. I propose to
  modify
  that one such that there must be 24 hour duration between creation of
  jira
  and patch commit, that will ensure that there is sufficient time for
  folks
  to see changes which are happening on trunk.
 
  Thanks,
  Ashutosh
 
 
  On Fri, Dec 27, 2013 at 9:33 AM, Thejas Nair the...@hortonworks.com
  wrote:
 
  The changes look good to me.
  Only concern I have is with the 7 days for release candidate voting.
  Based on my experience with releases, it often takes few cycles to get
  the candidate out, and people tend to vote closer to the end of the
  voting period. This can mean that it takes several weeks to get a
  release out. But this will not be so much of a problem as long as
  people don't wait for end of the voting period to vote, or if they
  look at the candidate branch even before the release candidate is out.
 
  Should we also include a provision for branch merges ? I think we
  should have a longer voting period for branch merges (3 days instead
  of 1?) and require 3 +1s (this part is also in the hadoop by-law ) .
 
 
  On Thu, Dec 26, 2013 at 7:08 PM, Carl Steinbach c...@apache.org wrote:
   I think we should make several changes to the Apache Hive Project
   Bylaws.
   The proposed changes are available for review here:
  
  
 
  https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38568856
  
   Most of the changes were directly inspired by provisions found in the
  Apache
   Hadoop Project Bylaws.
  
   Summary of proposed changes:
  
   * Add provisions for branch committers and speculative branches.
  
   * Define the responsibilities of a release manager.
  
   * PMC Chairs serve for one year and are elected by the PMC using
   Single
   Transferable Vote (STV) voting.
  
   * With the exception of code change votes, the minimum length of all
  voting
   periods is extended to seven days.
  
   Thanks.
  
   Carl
 
  --
  CONFIDENTIALITY NOTICE
  NOTICE: This message is intended for the use of the individual or
  entity to
  which it is addressed and may contain information that is confidential,
  privileged and exempt from disclosure under applicable law. If the
  reader
  of this message is not the intended recipient, you are hereby notified
  that
  any printing, copying, dissemination, distribution, disclosure or
  forwarding of this communication is strictly prohibited. If you have
  received this communication in error, please contact the sender
  immediately
  and delete it from your system. Thank You.
 


Re: [ANNOUNCE] New Hive Committer - Vikram Dixit

2014-01-06 Thread Thejas Nair
Congrats Vikram!


On Mon, Jan 6, 2014 at 9:01 AM, Jarek Jarcec Cecho jar...@apache.org wrote:
 Congratulations Vikram!

 Jarcec

 On Mon, Jan 06, 2014 at 08:58:06AM -0800, Carl Steinbach wrote:
 The Apache Hive PMC has voted to make Vikram Dixit a committer on the
 Apache Hive Project.

 Please join me in congratulating Vikram!

 Thanks.

 Carl



Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2013-12-29 Thread Thejas Nair
On Sun, Dec 29, 2013 at 12:06 AM, Lefty Leverenz
leftylever...@gmail.com wrote:
 Let's discuss annual rotation of the PMC chair a bit more.  Although I
 agree with the points made in favor, I wonder about frequent loss of
 expertise and needing to establish new relationships.  What's the ramp-up
 time?

The ramp up time is not significant, as you can see from the list of
responsibilities mentioned here -
http://www.apache.org/dev/pmc.html#chair .
We have enough people in PMC who have been involved with Apache
project for long time and are familiar with apache bylaws and way of
doing things. Also, the former PMC chairs are likely to be around to
help as needed.

 Could a current chair be chosen for another consecutive term?  Could two
 chairs alternate years indefinitely?
I would take rotation to mean that we have a new chair
for the next term. I think it should be OK to have the same chair in
alternate years. Two years is a long time and it sounds reasonable
given the size of the community! :)

  Do many other projects have annual rotations?
Yes, at least the Hadoop and Pig projects have that. I could not find
bylaws pages easily for other projects.


 Would it be inconvenient to change chairs in the middle of a release?
No. The PMC Chair position does not have any special role in a release.

 And now to trivialize my comments:  while making other changes, let's fix
 this typo:  Membership of the PMC can be revoked by an unanimous vote
 ... *(should
 be a unanimous ... just like a university because the rule is based on
 sound, not spelling)*.

I think you should feel free to fix such typos in this wiki without
a vote on it! :)



Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2013-12-27 Thread Thejas Nair
The changes look good to me.
Only concern I have is with the 7 days for release candidate voting.
Based on my experience with releases, it often takes few cycles to get
the candidate out, and people tend to vote closer to the end of the
voting period. This can mean that it takes several weeks to get a
release out. But this will not be so much of a problem as long as
people don't wait for end of the voting period to vote, or if they
look at the candidate branch even before the release candidate is out.

Should we also include a provision for branch merges ? I think we
should have a longer voting period for branch merges (3 days instead
of 1?) and require 3 +1s (this part is also in the hadoop by-law ) .


On Thu, Dec 26, 2013 at 7:08 PM, Carl Steinbach c...@apache.org wrote:
 I think we should make several changes to the Apache Hive Project Bylaws.
 The proposed changes are available for review here:

 https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38568856

 Most of the changes were directly inspired by provisions found in the Apache
 Hadoop Project Bylaws.

 Summary of proposed changes:

 * Add provisions for branch committers and speculative branches.

 * Define the responsibilities of a release manager.

 * PMC Chairs serve for one year and are elected by the PMC using Single
 Transferable Vote (STV) voting.

 * With the exception of code change votes, the minimum length of all voting
 periods is extended to seven days.

 Thanks.

 Carl



Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2013-12-27 Thread Thejas Nair
I agree with Ashutosh that the 24 hour waiting period after +1 is
cumbersome, I have also forgotten to commit patches after +1,
resulting in patches going stale.

But I think 24 hours wait between creation of jira and patch commit is
not very useful, as the thing to be examined is the patch and not the
jira summary/description.
I think having a waiting period of 24 hours between a jira being made
'patch available' and committing is better and sufficient.


On Fri, Dec 27, 2013 at 11:44 AM, Ashutosh Chauhan hashut...@apache.org wrote:
 Proposed changes look good to me, both suggested by Carl and Thejas.
 Another one I would like to add for consideration is: 24 hour rule between
 +1 and commit. Since this exists only in Hive (no other apache project
 which I am aware of) this surprises new contributors. More importantly, I
 have seen multiple cases where patch didn't get committed because committer
 after +1 forgot to commit after 24 hours have passed. I propose to modify
 that one such that there must be 24 hour duration between creation of jira
 and patch commit, that will ensure that there is sufficient time for folks
 to see changes which are happening on trunk.

 Thanks,
 Ashutosh


 On Fri, Dec 27, 2013 at 9:33 AM, Thejas Nair the...@hortonworks.com wrote:

 The changes look good to me.
 Only concern I have is with the 7 days for release candidate voting.
 Based on my experience with releases, it often takes few cycles to get
 the candidate out, and people tend to vote closer to the end of the
 voting period. This can mean that it takes several weeks to get a
 release out. But this will not be so much of a problem as long as
 people don't wait for end of the voting period to vote, or if they
 look at the candidate branch even before the release candidate is out.

 Should we also include a provision for branch merges ? I think we
 should have a longer voting period for branch merges (3 days instead
 of 1?) and require 3 +1s (this part is also in the hadoop by-law ) .


 On Thu, Dec 26, 2013 at 7:08 PM, Carl Steinbach c...@apache.org wrote:
  I think we should make several changes to the Apache Hive Project Bylaws.
  The proposed changes are available for review here:
 
 
 https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38568856
 
  Most of the changes were directly inspired by provisions found in the
 Apache
  Hadoop Project Bylaws.
 
  Summary of proposed changes:
 
  * Add provisions for branch committers and speculative branches.
 
  * Define the responsibilities of a release manager.
 
  * PMC Chairs serve for one year and are elected by the PMC using Single
  Transferable Vote (STV) voting.
 
  * With the exception of code change votes, the minimum length of all
 voting
  periods is extended to seven days.
 
  Thanks.
 
  Carl





Re: Using Hive with WebHCat

2013-12-03 Thread Thejas Nair
Can you try setting templeton.storage.root in webhcat-site.xml to a
directory that exists?
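
A property block like the following could be added to webhcat-site.xml (a minimal sketch; the value /templeton-hadoop matches the job-state path visible in Jonathan's logs, but it is only an assumed example and should point at an existing directory writable by the WebHCat server user):

```xml
<property>
  <name>templeton.storage.root</name>
  <!-- Assumed example value: any pre-existing HDFS directory that the
       WebHCat server user can write to. -->
  <value>/templeton-hadoop</value>
  <description>Root HDFS path where WebHCat (Templeton) keeps job state.</description>
</property>
```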


On Mon, Dec 2, 2013 at 6:21 AM, Jonathan Hodges hodg...@gmail.com wrote:
 Hi,

 I have setup WebHCat that is bundled with Hive 0.11.0.  I am able to kick of
 map reduce jobs with the REST API successfully.  However I am having some
 issues with Hive commands over REST.  The following is my
 $TEMPLETON_HOME/webhcat-site.xml.


 <?xml version="1.0" encoding="UTF-8"?>
 <!-- The default settings for Templeton. -->
 <!-- Edit templeton-site.xml to change settings for your local -->
 <!-- install. -->

 <configuration>

   <property>
     <name>templeton.pig.archive</name>
     <value>s3n://pearson-alto-hadoop/apps/webhcat/archives/pig-0.11.1.tar.gz</value>
     <description>The path to the Pig archive.</description>
   </property>

   <property>
     <name>templeton.pig.path</name>
     <value>pig-0.11.1.tar.gz/pig-0.11.1/bin/pig</value>
     <description>The path to the Pig executable.</description>
   </property>

   <property>
     <name>templeton.hive.archive</name>
     <value>s3n://pearson-alto-hadoop/apps/webhcat/archives/hive-0.11.0.tar.gz</value>
     <description>The path to the Hive archive.</description>
   </property>

   <property>
     <name>templeton.hive.path</name>
     <value>hive-0.11.0.tar.gz/hive-0.11.0-bin/bin/hive</value>
     <description>The path to the Hive executable.</description>
   </property>

 </configuration>


 curl -s -d user.name=hadoop \
-d 'execute=show+tables;' \
-d statusdir=s3n://pearson-alto-hadoop/webhcat/hive \
'http://10.201.5.28:50111/templeton/v1/hive'
 {id:job_201311281741_0020}


 When I check the statusdir and jobs folders I see the job had an exit status
 of 1 so it wasn't successful.

 hadoop fs -ls s3n://pearson-alto-hadoop/webhcat/hive
 Found 1 items
 -rwxrwxrwx   1  2 2013-11-29 15:15 /webhcat/hive/exit


 hadoop fs -ls /templeton-hadoop/jobs/job_201311281741_0020
 Found 3 items
 -rw-r--r--   1 hadoop supergroup  4 2013-11-29 15:15
 /templeton-hadoop/jobs/job_201311281741_0020/completed
 -rw-r--r--   1 hadoop supergroup  1 2013-11-29 15:15
 /templeton-hadoop/jobs/job_201311281741_0020/exitValue
 -rw-r--r--   1 hadoop supergroup  6 2013-11-29 15:15
 /templeton-hadoop/jobs/job_201311281741_0020/user

 Here is what I see in the logs.

 DEBUG | 29 Nov 2013 15:15:36,133 | org.apache.hcatalog.templeton.Server |
 queued job job_201311281741_0020 in 13403 ms
 DEBUG | 29 Nov 2013 15:16:09,583 |
 org.apache.hcatalog.templeton.tool.HDFSStorage | Couldn't find
 /templeton-hadoop/jobs/job_201311281741_0020/notified: File does not exist:
 /templeton-hadoop/jobs/job_201311281741_0020/notified
 DEBUG | 29 Nov 2013 15:16:09,584 |
 org.apache.hcatalog.templeton.tool.HDFSStorage | Couldn't find
 /templeton-hadoop/jobs/job_201311281741_0020/callback: File does not exist:
 /templeton-hadoop/jobs/job_201311281741_0020/callback


 How do I figure out the reason for failure?

 Thanks,
 Jonathan



Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson

2013-11-21 Thread Thejas Nair
Congrats!


On Thu, Nov 21, 2013 at 3:46 PM, Shreepadma Venugopalan
shreepa...@cloudera.com wrote:
 Congrats guys!


 On Thu, Nov 21, 2013 at 3:37 PM, Vinod Kumar Vavilapalli vino...@apache.org
 wrote:

 Congratulations to both! Great job and keep up the good work!

 Thanks,
 +Vinod

 On Nov 21, 2013, at 3:29 PM, Carl Steinbach wrote:

  The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric
 Hanson
  committers on the Apache Hive project.
 
  Please join me in congratulating Jitendra and Eric!
 
  Thanks.
 
  Carl






Re: Difference in number of row observstions from distinct and group by

2013-11-21 Thread Thejas Nair
You probably have 400 rows where col1, col2, and col3 have NULL values.
count(distinct col1, col2, col3) will not count those rows.
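
The NULL behavior described above can be reproduced outside Hive. This sketch uses SQLite through Python's standard library as a stand-in (the NULL semantics of these aggregates match Hive's); it uses a single key column because SQLite's COUNT(DISTINCT) only accepts one argument, but the principle is the same: COUNT(DISTINCT ...) ignores rows whose key is NULL, while GROUP BY still emits a group for the NULL key, so the GROUP BY count comes out higher.

```python
import sqlite3

# COUNT(DISTINCT col) vs. a GROUP BY subquery when NULL keys are present.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [("a",), ("a",), ("b",), (None,), (None,)])

# DISTINCT aggregation ignores the NULL rows entirely.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT col1) FROM t").fetchone()[0]

# GROUP BY produces one extra group for the NULL key.
group_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT col1 FROM t GROUP BY col1)").fetchone()[0]

print(distinct_count, group_count)  # 2 3
```

Each distinct key combination that contains a NULL behaves the same way: it is skipped by COUNT(DISTINCT) but still forms a group under GROUP BY, which explains a gap between the two row counts.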


On Thu, Nov 21, 2013 at 7:13 AM, Mayank Bansal
mayank.ban...@mu-sigma.com wrote:
 Hi,



 I have a table which has 3 columns combined together to form a primary key.
 If I do



 Select count(distinct col1,col2,col3) from table_name;



 And



 Select count(a.*) from (select col1,col2,col3,count(*) from table_name group
 by col1,col2,col3)a ;



 While running the first query, the count of rows that I get is 400 less than
 what I get by the second query.

 Can someone please explain to me the difference in number of observations
 from both the queries?



 Thanks

 Mayank

 This email message may contain proprietary, private and confidential
 information. The information transmitted is intended only for the person(s)
 or entities to which it is addressed. Any review, retransmission,
 dissemination or other use of, or taking of any action in reliance upon,
 this information by persons or entities other than the intended recipient is
 prohibited and may be illegal. If you received this in error, please contact
 the sender and delete the message from your system. Mu Sigma takes all
 reasonable steps to ensure that its electronic communications are free from
 viruses. However, given Internet accessibility, the Company cannot accept
 liability for any virus introduced by this e-mail or any attachment and you
 are advised to use up-to-date virus checking software.



Re: [ANNOUNCE] New Hive Committer and PMC Member - Lefty Leverenz

2013-11-17 Thread Thejas Nair
Congrats Lefty!


On Sat, Nov 16, 2013 at 9:20 PM, Carl Steinbach c...@apache.org wrote:
 The Apache Hive PMC has voted to make Lefty Leverenz a committer and PMC
 member on the Apache Hive Project.

 Please join me in congratulating Lefty!

 Thanks.

 Carl



Re: [ANNOUNCE] New Hive PMC Member - Harish Butani

2013-11-15 Thread Thejas Nair
Congrats Harish!


On Fri, Nov 15, 2013 at 1:08 AM, Gunther Hagleitner
ghagleit...@hortonworks.com wrote:
 Congratulations Harish! Very cool.


 On Thu, Nov 14, 2013 at 10:01 PM, Prasad Mujumdar pras...@cloudera.com wrote:

 Congratulations !!

 thanks
 Prasad



 On Thu, Nov 14, 2013 at 5:17 PM, Carl Steinbach c...@apache.org wrote:

  I am pleased to announce that Harish Butani has been elected to the Hive
  Project Management Committee. Please join me in congratulating Harish!
 
  Thanks.
 
  Carl
 





Re: Running Hive 0.12.0 with Hadoop 2.2

2013-11-15 Thread Thejas Nair
It is better to first upgrade Hive 0.12 to Protocol Buffers 2.5 -
https://issues.apache.org/jira/browse/HIVE-5112. This will make it
into the Hive 0.13 release.
If you want a pre-built hive 0.12 that works with hadoop 2.2.x , you
can download one from hortonworks website -
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap13.html
. (that has a few additional patches also, see full list here -
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_releasenotes_hdp_2.0/content/ch_relnotes-hdp2.0.6.0-hive.html
)

On Thu, Nov 14, 2013 at 8:30 PM, Bill Q bill.q@gmail.com wrote:
 Hi Franks,
 Many thanks for sharing the information.

 Many thanks.


 Bill


 On Thu, Nov 14, 2013 at 4:31 PM, Frank Davidson ffdavid...@gmail.com
 wrote:

 Well, to be honest, it's the first time I've ever set it up, so I'm not
 entirely sure what constitutes extra, but I followed the wiki...

 https://cwiki.apache.org/confluence/display/Hive/GettingStarted

 ... and didn't seem to have too many problems... There were probably a few
 settings I had to tinker with, but it seems to be working fine now...

 Cheers,

 Frank


 On Thu, Nov 14, 2013 at 3:57 PM, Bill Q bill.q@gmail.com wrote:

 Hi Frank,
 Thanks for your reply. So, you don't have to do any extra work and just
 install it as in Hadoop 1.x?


 On Thursday, November 14, 2013, Frank Davidson wrote:

 Yes, in fact, I just set it up last week...


 On Thu, Nov 14, 2013 at 1:55 PM, Bill Q bill.q@gmail.com wrote:

 Anyone knows? Many thanks.


 On Thursday, November 14, 2013, Bill Q wrote:

 Hi,
 Does Hive 0.12 run on Hadoop 2.2? if it does, anything special about
 installing it or running hive? Will hive start a M/R ApplicationMaster by
 itself or should I start one for it?

 Many thanks.


 Bill



 --
 Many thanks.


 Bill




 --
 Frank Davidson
 Email: ffdavid...@gmail.com
 AIM: davidsonff



 --
 Many thanks.


 Bill




 --
 Frank Davidson
 Email: ffdavid...@gmail.com
 AIM: davidsonff





Re: [ANNOUNCE] New Hive Committer - Prasad Mujumdar

2013-11-10 Thread Thejas Nair
Congrats Prasad!

On Sun, Nov 10, 2013 at 6:46 PM, Jarek Jarcec Cecho jar...@apache.org wrote:
 Congratulations Prasad, good job!

 Jarcec

 On Sun, Nov 10, 2013 at 06:42:45PM -0800, Carl Steinbach wrote:
 The Apache Hive PMC has voted to make Prasad Mujumdar a committer on the
 Apache Hive Project.

 Please join me in congratulating Prasad!

 Thanks.

 Carl



Re: UnitTest did not pass during compile the latest hive code

2013-11-07 Thread Thejas Nair
You can use System.getProperty("test.tmp.dir") instead of '/tmp'. That is
more portable.
It would be wonderful if you could create a new issue in the Hive JIRA (
https://issues.apache.org/jira/browse/HIVE) and attach a patch with this
change!
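
A minimal sketch of that suggestion (the class name and the fallback to java.io.tmpdir are my own choices for illustration, not taken from the original patch): resolve a per-test scratch directory from the test.tmp.dir property instead of hard-coding /tmp, so the test cannot trip over unrelated files already sitting there.

```java
import java.io.File;

// Sketch: build a unique scratch directory under the test.tmp.dir system
// property, falling back to java.io.tmpdir, instead of using a hard-coded
// /tmp that may contain unrelated files (e.g. fcitx-socket-:0).
public class TestTmpDirExample {
    static File testScratchDir() {
        String base = System.getProperty("test.tmp.dir",
                System.getProperty("java.io.tmpdir"));
        File dir = new File(base, "hive-junit-test-" + System.nanoTime());
        if (!dir.mkdirs()) {
            throw new IllegalStateException("could not create " + dir);
        }
        dir.deleteOnExit();
        return dir;
    }

    public static void main(String[] args) {
        File dir = testScratchDir();
        System.out.println(dir.isDirectory());
    }
}
```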



On Tue, Nov 5, 2013 at 9:02 PM, 金杰 hellojin...@gmail.com wrote:

 I modify the test code. And it passed the test.

 See the diff below, Could anyone fix it?


 diff --git
 hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestUseDatabase.java
 hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestUseDatabase.java
 index d164da3..6624849 100644
 ---
 hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestUseDatabase.java
 +++
 hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestUseDatabase.java
 @@ -19,6 +19,7 @@
  package org.apache.hcatalog.cli;

  import java.io.IOException;
 +import java.io.File;

  import junit.framework.TestCase;

 @@ -63,7 +64,10 @@ public void testAlterTablePass() throws IOException,
 CommandNeedRetryException {

  CommandProcessorResponse response;

 -response = hcatDriver.run("alter table " + tblName + " add partition
 (b='2') location '/tmp'");
 +File tmp = new File(System.getProperty("java.io.tmpdir") +
 "/hive-junit-test-" + System.nanoTime());
 +tmp.mkdir();
 +tmp.deleteOnExit();
 +response = hcatDriver.run("alter table " + tblName + " add partition
 (b='2') location '" + tmp.getAbsolutePath() + "'");
  assertEquals(0, response.getResponseCode());
  assertNull(response.getErrorMessage());

 diff --git
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestUseDatabase.java
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestUseDatabase.java
 index f362b69..e0247ad 100644
 ---
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestUseDatabase.java
 +++
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestUseDatabase.java
 @@ -19,6 +19,7 @@
  package org.apache.hive.hcatalog.cli;

  import java.io.IOException;
 +import java.io.File;

  import junit.framework.TestCase;

 @@ -61,7 +62,10 @@ public void testAlterTablePass() throws IOException,
 CommandNeedRetryException {

  CommandProcessorResponse response;

 -response = hcatDriver.run("alter table " + tblName + " add partition
 (b='2') location '/tmp'");
 +File tmp = new File(System.getProperty("java.io.tmpdir") +
 "/hive-junit-test-" + System.nanoTime());
 +tmp.mkdir();
 +tmp.deleteOnExit();
 +response = hcatDriver.run("alter table " + tblName + " add partition
 (b='2') location '" + tmp.getAbsolutePath() + "'");
  assertEquals(0, response.getResponseCode());
  assertNull(response.getErrorMessage());



 Best Regards
 金杰 (Jay Jin)


 On Wed, Nov 6, 2013 at 10:51 AM, 金杰 hellojin...@gmail.com wrote:

 Thanks Tim  Thejas

 I trying to compile the latest code because I want to learn HIVE code.

 I have compiled HIVE successfully.

 But still have problem in running the tests.

 The test

 ./hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestUseDatabase.java
 failed to pass the junit test.

 we have code below in the test code:

 CommandProcessorResponse response;

 response = hcatDriver.run("alter table " + tblName + " add partition
 (b='2') location '/tmp'");
 assertEquals(0, response.getResponseCode());
 assertNull(response.getErrorMessage());

 The code is trying to read the /tmp directory. But in my /tmp directory
 there is a file named fcitx-socket-:0.
 According to this issue https://issues.apache.org/jira/browse/HADOOP-7945
 Hadoop
 does not allow ':' in filenames.
 The code print following message:

 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask.
 MetaException(message:java.lang.IllegalArgumentException:
 java.net.URISyntaxException: Relative path in absolute URI: fcitx-socket-:0)

 I'm trying to fix it.


 Best Regards
 金杰 (Jay Jin)


 On Wed, Nov 6, 2013 at 4:58 AM, Thejas Nair the...@hortonworks.com wrote:

 The new instructions for using maven are here -
 https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ
 I have updated the
 https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide with
 link to above document. But it still needs cleanup.


 On Tue, Nov 5, 2013 at 7:46 AM, Tim Chou timchou@gmail.com wrote:
  Hi Jie,
 
  Can you compile HIVE successfully now? You need to modify some settings
  according to your error information.
  Maybe you can use the release version to avoid the error.
 
  Tim
 
 
  2013/11/5 金杰 hellojin...@gmail.com
 
  I got it.
 
  I need to run mvn install -DskipTests before I run mvn install
 
  Are there any documents that I can follow to help me compile or
 reading
  hive code?
  The documents on
  https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide seem to
  be
  outdated.
 
 
 
 
  Best Regards
  金杰 (Jay Jin)
 
 
  On Tue, Nov 5, 2013 at 9:31 PM, 金杰 hellojin...@gmail.com wrote:
 
  Hi, All
 
  When I try to compile the latest code of hive using mvn install
  I got these messages. How to pass these unit

Re: [ANNOUNCE] New Hive Committer - Xuefu Zhang

2013-11-04 Thread Thejas Nair
Congrats Xuefu!

On Sun, Nov 3, 2013 at 11:11 PM, Mohammad Islam misla...@yahoo.com wrote:
 Congrats Xuefu!

 --Mohammad


 On Sunday, November 3, 2013 9:07 PM, hsubramani...@hortonworks.com
 hsubramani...@hortonworks.com wrote:
 Congrats Xuefu!

 Thanks,
 Hari

 On Nov 3, 2013, at 8:28 PM, Gunther Hagleitner ghagleit...@hortonworks.com
 wrote:

 Congrats Xuefu!

 Gunther.


 On Sun, Nov 3, 2013 at 8:23 PM, Lefty Leverenz
 leftylever...@gmail.com wrote:

 Bravo Xuefu!


 -- Lefty



 On Sun, Nov 3, 2013 at 11:09 PM, Zhang Xiaoyu zhangxiaoyu...@gmail.com

 wrote:


 Congratulations! Xuefu, well deserved!


 Johnny



 On Sun, Nov 3, 2013 at 8:06 PM, Carl Steinbach cwsteinb...@gmail.com

 wrote:


 The Apache Hive PMC has voted to make Xuefu Zhang a committer on the

 Apache Hive project.


 Please join me in congratulating Xuefu!


 Thanks.


 Carl










Re: Hive 12 with Hadoop 2.x with ORC

2013-10-22 Thread Thejas Nair
protobuf 2.5 upgrade did not get included in hive 0.12 (HIVE-5112).
You might want to apply the protobuf update patch on top of 0.12 to
use it with recent versions of hadoop 2.x (but I am not certain that this
is a protobuf version issue).
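A hedged sketch of applying the HIVE-5112 patch on top of the 0.12.0 release (the svn tag path and patch file name are assumptions; download the actual patch from the jira):

```sh
# Check out the 0.12.0 release sources (tag path is an assumption)
svn co http://svn.apache.org/repos/asf/hive/tags/release-0.12.0/ hive-0.12.0
cd hive-0.12.0

# Apply the protobuf-2.5 upgrade patch attached to HIVE-5112
# (file name is illustrative)
patch -p0 < HIVE-5112.patch

# Rebuild against hadoop 2.x
ant clean package -Dhadoop.mr.rev=23
```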


On Tue, Oct 22, 2013 at 6:53 AM, Rajesh Balamohan
rajesh.balamo...@gmail.com wrote:
 Hi All,

 When running Hive 12 with Hadoop 2.x with ORC, I get the following error
 while converting a table with text file to ORC format table.  Any help will
 be greatly appreciated

 2013-10-22 06:50:49,563 WARN [main] org.apache.hadoop.mapred.YarnChild:
 Exception running child : java.lang.RuntimeException: Hive Runtime Error
 while closing operators
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
 Caused by: java.lang.UnsupportedOperationException: This is supposed to be
 overridden by subclasses.
   at
 com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
   at
 org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
   at
 com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
   at
 com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
   at
 org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
   at
 com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
   at
 com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
   at
 org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
   at
 com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
   at
 org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
   at
 org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
   at
 org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
   at 
 org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
   at
 org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
   at
 org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
   at
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
   ... 8 more





 --
 ~Rajesh.B



Re: HS2 ODBC incompatibility

2013-10-21 Thread Thejas Nair
Yes, the current odbc driver source in hive is not compatible with
hive server2. I am not aware of anybody working on it.
But you can download an ODBC driver add-on for hive server2, for free,
from the hortonworks website:
http://hortonworks.com/download/download-archives/

On Mon, Oct 21, 2013 at 5:06 AM, Haroon Muhammad
muhammad.har...@live.com wrote:
 Hi,

 Source under ODBC seems to be incompatible with HS2's changed RPC thrift
 interface. Are there any plans on getting an updated version out any time
 sooner ?

 Thanks,



Re: In Beeline what is the syntax for ALTER TABLE ?

2013-10-21 Thread Thejas Nair
beeline does not have any query syntax of its own; it just sends the query
to hive server2, which uses the same code as the hive cli to run the query,
i.e. the query syntax is the same as in the hive cli.
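For the ALTER TABLE failure quoted below, one likely fix (an assumption: the error text suggests the LOCATION URI needs a filesystem scheme; the namenode host and port here are placeholders) is to fully qualify the partition location:

```sql
-- Qualify LOCATION with a scheme; this Hive version rejects scheme-less URIs.
ALTER TABLE outpdir_seller_hidden
  ADD IF NOT EXISTS PARTITION (
    header_date_partition='2013-10-17',
    header_servername_partition='lu3')
  LOCATION 'hdfs://namenode:8020/data/output/impressions/outpdir/2013-10-17/008-131018121904385-oozie-oozi-W/outpdir_seller_hidden/sellerhidden/2013-10-17/lu3';
```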




On Mon, Oct 21, 2013 at 11:41 AM, Sanjay Subramanian 
sanjay.subraman...@wizecommerce.com wrote:

   Hi guys

  Using
Hive 0.10.0+198 CDH4

  Getting this error for ALTER table command

  jdbc:hive2://dev-thdp5.corp.nextag.com:100 ALTER TABLE
 outpdir_seller_hidden ADD IF NOT EXISTS PARTITION
 (header_date_partition='2013-10-17', header_servername_partition='lu3')
 LOCATION
 '/data/output/impressions/outpdir/2013-10-17/008-131018121904385-oozie-oozi-W/outpdir_seller_hidden/sellerhidden/2013-10-17/lu3';

 Error: Error while processing statement: FAILED: IllegalArgumentException
 URI '' in invalid. Must start with file:// or hdfs://
 (state=42000,code=4)

   Thanks
 Regards

  sanjay



 CONFIDENTIALITY NOTICE
 ==
 This email message and any attachments are for the exclusive use of the
 intended recipient(s) and may contain confidential and privileged
 information. Any unauthorized review, use, disclosure or distribution is
 prohibited. If you are not the intended recipient, please contact the
 sender by reply email and destroy all copies of the original message along
 with any attachments, from your computer system. If you are the intended
 recipient, please be advised that the content of this message is subject to
 access, review and disclosure by the sender's Email System Administrator.




Fwd: [ANNOUNCE] Apache Hive 0.12.0 Released

2013-10-15 Thread Thejas Nair
The Apache Hive team is proud to announce the release of Apache
Hive version 0.12.0. The Apache Hive (TM) data warehouse software
facilitates querying and
managing large datasets residing in distributed storage. Built on top
of Apache Hadoop (TM), it provides:
* Tools to enable easy data extract/transform/load (ETL)
* A mechanism to impose structure on a variety of data formats
* Access to files stored either directly in Apache HDFS (TM) or in
other data storage systems such as Apache HBase (TM)
* Query execution via MapReduce

For Hive release details and downloads, please visit:
http://hive.apache.org/releases.html

Hive 0.12.0 Release Notes are available here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324312&styleName=Text&projectId=12310843

We would like to thank the many contributors who made this release possible.

Regards,
The Apache Hive Team



[RESULT] Apache Hive 0.12.0 Release Candidate 1

2013-10-14 Thread Thejas Nair
With 3 binding +1s and 2 non-binding +1s and no -1s, this vote passes.
Thanks to everybody who voted and gave feedback.
I will start working on publishing the release.

Thanks,
Thejas



On Sun, Oct 13, 2013 at 4:37 PM, Carl Steinbach c...@apache.org wrote:
 +1 (binding)


 Regarding the 3 day deadline for voting, that is what is in the hive
 bylaws. I also see that it has been followed in the last few releases I
 checked.


 3 days is the minimum length of the voting period, not the maximum.

 Thanks.

 Carl



Re: Use hive on hadoop 3.0.0

2013-10-14 Thread Thejas Nair
If you are using HIVE-TEZ branch, you also need tez (git clone
g...@github.com:apache/tez.git). Publish the tez artifacts locally (mvn
clean install -DskipTests).
In hive, change build.properties to have
hadoop-0.23.version=3.0.0-SNAPSHOT. Specifying that using
-Dhadoop-0.23.version in your ant command should also work. You
should also specify -Dhadoop.mr.rev=23.

Btw, you can also use branch-2.1-beta branch of hadoop, which results
in version 2.1.2-SNAPSHOT currently.
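The steps above can be sketched as a shell transcript (hedged: version strings follow the thread, and the tez repository URL is spelled out here as an assumption):

```sh
# Publish tez artifacts to the local Maven repository
git clone git@github.com:apache/tez.git
cd tez
mvn clean install -DskipTests
cd ..

# Build hive (ant-based at the time) against Hadoop 3.0.0-SNAPSHOT
cd hive
ant clean package \
    -Dhadoop-0.23.version=3.0.0-SNAPSHOT \
    -Dhadoop.mr.rev=23
```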


Thanks,
Thejas


On Sat, Oct 12, 2013 at 4:27 PM, Tim Chou timchou@gmail.com wrote:
 Dear Hivers,

 I am a user of HIVE (or the HIVE-TEZ branch). I want to run hive on
 Hadoop-3.0.0-SNAPSHOT.

 I have changed hadoop version when I compiled hive. But it still doesn't
 work with Hadoop-3.0.0-SNAPSHOT.

 I don't know how to do in this situation. Can someone help me?

 Thanks,
 Tim




Re: Why doesn't work?

2013-09-26 Thread Thejas Nair
Can you share some examples of what does not work?


On Thu, Sep 26, 2013 at 11:34 AM, Gary Zhao garyz...@gmail.com wrote:
 Hello

 I found something strange. I tried a few queries, in WHERE

 1.  works, returns expected results
 2.  and  doesn't work, returns 0 result
 3.  doesn't work, return 0 result
 4. BETWEEN, syntax error

 Basically, I want to find records between two timestamps that are epoch unix
 timestamps. Is there anything I did wrong?

 Thanks
 Gary
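For reference, hedged examples of range predicates over epoch timestamps in HiveQL (table and column names are made up). A common cause of the symptoms described is the timestamp column being a STRING, which makes comparisons lexicographic rather than numeric:

```sql
-- Assuming ts is a BIGINT holding epoch seconds
SELECT * FROM events WHERE ts >= 1380000000 AND ts < 1380086400;

-- BETWEEN is inclusive on both ends
SELECT * FROM events WHERE ts BETWEEN 1380000000 AND 1380086399;

-- If ts is a STRING, cast it to get numeric comparison
SELECT * FROM events WHERE CAST(ts AS BIGINT) BETWEEN 1380000000 AND 1380086399;
```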




Re: User accounts to execute hive queries

2013-09-18 Thread Thejas Nair
You might find my slides on this topic useful -
http://www.slideshare.net/thejasmn/hive-authorization-models

Also linked from last slide  -
https://cwiki.apache.org/confluence/display/HCATALOG/Storage+Based+Authorization

On Tue, Sep 17, 2013 at 11:46 PM, Nitin Pawar nitinpawar...@gmail.com wrote:
 The link I gave in the previous mail explains how you can do user-level
 authorization in hive.



 On Mon, Sep 16, 2013 at 7:57 PM, shouvanik.hal...@accenture.com wrote:

 Hi Nitin,



 I want it secured.



 Yes, I would like to give specific access to specific users. E.g. “select
 * from” access to some and “add/modify/delete” options to some





 “What kind of security do you have on hdfs? “

 I could not follow this question



 Thanks,

 Shouvanik

 From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
 Sent: Monday, September 16, 2013 6:50 PM
 To: Haldar, Shouvanik
 Cc: user@hive.apache.org
 Subject: Re: User accounts to execute hive queries



 You will need to tell few more things.

 Do you want it secured?

 Do you distinguish users in different categories on what one particular
 user can do or not?

 What kind of security do you have on hdfs?





 It is definitely possible for users to run queries under their own
 usernames, but then you have to take a few measures as well:
 which user can do what action, which user can access what location on hdfs,
 etc.



 For user management on hive side you can read at
 https://cwiki.apache.org/Hive/languagemanual-authorization.html



 if you do not want to go through the secure way,

 then add all the users to one group and then grant permissions to that
 group on your warehouse directory.



 other way if the table data is not shared then,

 create individual directory for each user on hdfs and give only that user
 access to that directory.
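The two non-secure alternatives above can be sketched with HDFS shell commands (group, user, and path names are illustrative):

```sh
# Option 1: one shared group granted access to the warehouse directory
hadoop fs -chgrp -R hiveusers /user/hive/warehouse
hadoop fs -chmod -R 770 /user/hive/warehouse

# Option 2: a private data directory per user
hadoop fs -mkdir /user/alice/hive
hadoop fs -chown alice:alice /user/alice/hive
hadoop fs -chmod 700 /user/alice/hive
```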


 
 This message is for the designated recipient only and may contain
 privileged, proprietary, or otherwise confidential information. If you have
 received it in error, please notify the sender immediately and delete the
 original. Any other use of the e-mail by you is prohibited.

 Where allowed by local law, electronic communications with Accenture and
 its affiliates, including e-mail and instant messaging (including content),
 may be scanned by our systems for the purposes of information security and
 assessment of internal compliance with Accenture policy.


 __

 www.accenture.com




 --
 Nitin Pawar



Re: [ANNOUNCE] New Hive Committer - Yin Huai

2013-09-04 Thread Thejas Nair
Congrats Yin!
Well deserved! Looking forward to many more contributions from you!



On Tue, Sep 3, 2013 at 11:45 PM, Hari Subramaniyan
hsubramani...@hortonworks.com wrote:
 Congrats !!


 On Tue, Sep 3, 2013 at 11:43 PM, Vaibhav Gumashta
 vgumas...@hortonworks.com wrote:

 Congrats Yin!


 On Tue, Sep 3, 2013 at 11:37 PM, Jarek Jarcec Cecho jar...@apache.org
 wrote:

 Congratulations Yin!

 Jarcec

 On Tue, Sep 03, 2013 at 09:49:55PM -0700, Carl Steinbach wrote:
  The Apache Hive PMC has voted to make Yin Huai a committer on the
  Apache
  Hive project.
 
  Please join me in congratulating Yin!
 
  Thanks.
 
  Carl





Re: ODBC driver for Excel

2013-08-22 Thread Thejas Nair
You can use the hortonworks odbc driver for free; it is available as an
add-on:
http://hortonworks.com/download/download-archives/
-Thejas



On Thu, Aug 22, 2013 at 4:30 PM, Chad Dotzenrod cdotzen...@gmail.comwrote:

 Trying to find an ODBC driver to use with Excel against my cluster that I
 don't have to shell out money for.  Are there options?  I'm using the
 cloudera driver for tableau and it works great!  Just need a similar option
 for excel



 Sent from my iPhone



Re: [ANNOUNCE] New Hive Committer - Thejas Nair

2013-08-20 Thread Thejas Nair
Thanks everybody!
This is a great honor. I hope to live up to the expectations!

It is great to see the fresh momentum in hive and I am glad to be able to
work with the community on it.
Now I will also do my best to help get the count for jiras in
patch-available state down!



On Tue, Aug 20, 2013 at 9:05 AM, Thiruvel Thirumoolan 
thiru...@yahoo-inc.com wrote:

  Congrats Thejas!

 On Aug 20, 2013, at 8:00 AM, Bill Graham billgra...@gmail.com wrote:

   Congrats Thejas!


 On Tue, Aug 20, 2013 at 7:32 AM, Jarek Jarcec Cecho jar...@apache.orgwrote:

 Congratulations Thejas!

 Jarcec

 On Tue, Aug 20, 2013 at 03:31:48AM -0700, Carl Steinbach wrote:
  The Apache Hive PMC has voted to make Thejas Nair a committer on the
 Apache
  Hive project.
 
  Please join me in congratulating Thejas!




  --
 *Note that I'm no longer using my Yahoo! email address. Please email me
 at billgra...@gmail.com going forward.*





Re: regarding Hive Thrift Metastore Server

2013-07-24 Thread Thejas Nair
Without a metastore server, you would need to make access to the mysql
db possible from all hive clients. This is inherently less secure,
because anybody who can run hive cli can modify the mysql db, and
there is no metastore server doing the authorization checks.
If you don't care about security, metastore server is not that
necessary. In the hortonworks distribution (HDP), we have metastore
server running by default, and that is the common use case.

What version of hive are you using ? Try disabling the file system
cache in hive-site.xml in metastore (fs.hdfs.impl.disable.cache=true
and fs.file.impl.disable.cache=true)
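The cache-disabling settings above go into the metastore server's hive-site.xml, roughly:

```xml
<!-- hive-site.xml on the metastore server: disable Hadoop FileSystem
     object caching for the hdfs:// and file:// schemes -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
<property>
  <name>fs.file.impl.disable.cache</name>
  <value>true</value>
</property>
```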

On Wed, Jul 24, 2013 at 1:56 PM, Shangzhong zhu shanzh...@gmail.com wrote:
 Hi all,

 Our current setting of Hive is:

 Hive Thrift server - MySQL metastore.

 All requests to the MySQL metastore go through the Thrift server. However,
 we have been seeing issues with this setting. The Thrift server once in a
 while gets stuck with TTransport timeout errors or even OOM.

 Seems removing the Hive Thrift server, and let all clients directly hit
 MySQL is a better option.

 Just want to check with the community: is everyone using the MySQL
 metastore directly, without the Thrift server, and is that the setup the
 Hive developer team recommends?

 Thanks,
 Shanzhong


Re: [ANNOUNCE] Apache Hive 0.11.0 Released

2013-05-17 Thread Thejas Nair
Thanks to all the contributors who helped create this new release!
And special thanks to the release managers  Ashutosh Chauhan and Owen
O'Malley - for driving this release.



On Thu, May 16, 2013 at 2:19 PM, Owen O'Malley omal...@apache.org wrote:

 The Apache Hive team is proud to announce the release of Apache
 Hive version 0.11.0.

 The Apache Hive data warehouse software facilitates querying and
 managing large datasets residing in distributed storage. Built on top
 of Apache Hadoop, it provides:

 * Tools to enable easy data extract/transform/load (ETL)

 * A mechanism to impose structure on a variety of data formats

 * Access to files stored either directly in Apache HDFS or in other
   data storage systems such as Apache HBase

 * Query execution via MapReduce

 For Hive release details and downloads, please visit:
 http://hive.apache.org/releases.html

 Hive 0.11.0 Release Notes are available here:


 https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12323587&styleName=Html&projectId=12310843

 We would like to thank the many contributors who made this release
 possible.

 Regards,

 The Apache Hive Team



requesting write access to hive confluence wiki

2012-04-30 Thread Thejas Nair

Hi,
I would like to update/fix some sections in the hive confluence wiki.
Please grant write access. My user name is thejas.

Thanks,
Thejas