Re: Review Request 66485: HIVE-19124 implement a basic major compactor for MM tables

2018-04-22 Thread Gopal V

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66485/#review201717
---




ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
Lines 353 (patched)


Add a timestamp to the tmp-table and fail-retry if it already exists.

Dropping it might make it harder to debug this.
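
The suggested scheme might look roughly like this sketch (Python pseudocode; the function and table names here are hypothetical illustrations, not the actual CompactorMR code):

```python
import time

def tmp_table_name(table, ts=None):
    # Hypothetical naming scheme: append a millisecond timestamp to the
    # tmp-table name so each compaction attempt gets a distinct table.
    ts = int(time.time() * 1000) if ts is None else ts
    return f"{table}_tmp_compactor_{ts}"

def create_tmp_table(existing_tables, table, ts=None):
    # Fail (so the caller can retry later) rather than dropping an existing
    # tmp table; the leftover table stays around for debugging.
    name = tmp_table_name(table, ts)
    if name in existing_tables:
        raise RuntimeError(f"tmp table {name} already exists; retry compaction later")
    existing_tables.add(name)
    return name
```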



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
Lines 1236 (patched)


Add comment about not needing locks because these are insert-only tables 
and the base writer doesn't need locks anyway.



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
Lines 1257 (patched)


This + the next one look a bit odd


- Gopal V


On April 20, 2018, 11:15 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66485/
> ---
> 
> (Updated April 20, 2018, 11:15 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 536c7b427f 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  82ba775286 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 9cb2ff1015 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java c8cb8a40b4 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java dde20ed56e 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> b1c2288d01 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 22765b8e63 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java fe0aaa4ff5 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java 
> c053860b36 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  cb1d40a4a8 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
>  7b02865e18 
>   
> storage-api/src/java/org/apache/hadoop/hive/common/ValidReaderWriteIdList.java
>  107ea9028a 
> 
> 
> Diff: https://reviews.apache.org/r/66485/diff/7/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[GitHub] hive pull request #334: HIVE-19219: Hive replicated database is out of sync ...

2018-04-22 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/334


---


[jira] [Created] (HIVE-19271) TestMiniLlapLocalCliDriver default_constraint and check_constraint failing

2018-04-22 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-19271:
--

 Summary: TestMiniLlapLocalCliDriver default_constraint and 
check_constraint failing
 Key: HIVE-19271
 URL: https://issues.apache.org/jira/browse/HIVE-19271
 Project: Hive
  Issue Type: Test
Reporter: Vineet Garg
Assignee: Vineet Garg






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: how to extract metadata of hive tables in speed

2018-04-22 Thread 侯宗田
Can anyone give me some suggestions? I have been stuck on this problem for 
several days. Need help!!
> On April 22, 2018, at 9:38 p.m., 侯宗田 wrote:
> 
> 
> Hi,
> 
> I am writing an application which needs metadata about Hive tables. I have 
> used WebHCat to get information about the tables and process it. But a 
> simple request takes over eight seconds to respond on localhost. Why is this 
> so slow, and how can I fix it? Or is there another way I can extract the 
> metadata in C?
> 
> $ time curl -s 
> 'http://localhost:50111/templeton/v1/ddl/database/default/table/haha?user.name=ctdean
>  
> '
> {"columns": 
>  [{"name":"id","type":"int"}],
>  "database":"default",
>  "table":"haha"}
> 
> real0m8.400s
> user0m0.053s
> sys 0m0.019s
> It seems to run hcat.py, which creates a bunch of things and then clears 
> them, and that takes a very long time. Does anyone have any ideas about it? 
> Any suggestions will be much appreciated!
> 
> $hcat.py -e "use default; desc haha; "
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings 
>  for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 18/04/21 16:38:13 INFO conf.HiveConf: Found configuration file 
> file:/usr/local/hive/conf/hive-site.xml
> 18/04/21 16:38:15 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: 
> /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
> 18/04/21 16:38:16 INFO session.SessionState: Created local directory: 
> /tmp/hive/java/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: 
> /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668/_tmp_space.db
> 18/04/21 16:38:16 INFO ql.Driver: Compiling 
> command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62):
>  use default
> 18/04/21 16:38:17 INFO metastore.HiveMetaStore: 0: Opening raw store with 
> implementation class:org.apache.hadoop.hive.metastore.ObjectStore
> 18/04/21 16:38:17 INFO metastore.ObjectStore: ObjectStore, initialize called
> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property 
> hive.metastore.integral.jdo.pushdown unknown - will be ignored
> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property 
> datanucleus.cache.level2 unknown - will be ignored
> 18/04/21 16:38:18 INFO metastore.ObjectStore: Setting MetaStore object pin 
> classes with 
> hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> 18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL, 
> underlying DB is MYSQL
> 18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added admin role in metastore
> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added public role in metastore
> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: No user is added in admin 
> role, since config is empty
> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_all_functions
> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda  
> ip=unknown-ip-addr  cmd=get_all_functions
> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda  
> ip=unknown-ip-addr  cmd=get_database: default
> 18/04/21 16:38:20 INFO ql.Driver: Semantic Analysis Completed
> 18/04/21 16:38:20 INFO ql.Driver: Returning Hive schema: 
> Schema(fieldSchemas:null, properties:null)
> 18/04/21 16:38:20 INFO ql.Driver: Completed compiling 
> command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62);
>  Time taken: 3.936 seconds
> 18/04/21 16:38:20 INFO ql.Driver: Concurrency mode is disabled, not creating 
> a lock manager
> 18/04/21 16:38:20 INFO ql.Driver: Executing 
> command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62):
>  use default
> 18/04/21 16:38:20 INFO sqlstd.SQLStdHiveAccessController: Created 
> SQLStdHiveAccessController for session context : HiveAuthzSessionContext 
> [sessionString=05096382-f9b6-4dae-aee2-dfa6750c0668, clientType=HIVECLI]
> 18/04/21 16:38:20 WARN session.SessionState: METASTORE_FILTER_HOOK will be 
> ignored, since hive.security.authorization.manager is set to instance of 
> HiveAuthorizerFactory.
> 18/04/21 16:38:20 INFO hive.metastore: Mestastore configuration 
> hive.metastore.filter.hook 

[jira] [Created] (HIVE-19270) TestAcidOnTez tests are failing

2018-04-22 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-19270:
--

 Summary: TestAcidOnTez tests are failing
 Key: HIVE-19270
 URL: https://issues.apache.org/jira/browse/HIVE-19270
 Project: Hive
  Issue Type: Sub-task
Reporter: Vineet Garg


Following tests are failing:
* testCtasTezUnion
* testNonStandardConversion01
* testAcidInsertWithRemoveUnion

All of them have a similar failure:
{noformat}
Actual line 0 ac: {"writeid":1,"bucketid":536870913,"rowid":1} 1 2 
file:/home/hiveptest/35.193.47.6-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.TestAcidOnTez-1524409020904/warehouse/t/delta_001_001_0001/bucket_0
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hive pull request #335: HIVE-19267 : Create/Replicate ACID Write event

2018-04-22 Thread maheshk114
GitHub user maheshk114 opened a pull request:

https://github.com/apache/hive/pull/335

HIVE-19267 : Create/Replicate ACID Write event

Replicate ACID write events.
Create a new EVENT_WRITE event, with a related message format, to log the 
write operations within a txn along with the associated data.
Log this event when performing any write (insert into, insert overwrite, load 
table, delete, update, merge, truncate) on a table/partition.
If a single MERGE/UPDATE/INSERT/DELETE statement operates on multiple 
partitions, log one event per partition.
DbNotificationListener should log this type of event to a special metastore 
table named "MTxnWriteNotificationLog".
This table should maintain a map of txn ID against the list of 
tables/partitions written by the given txn.
The entry for a given txn should be removed by the cleaner thread that 
removes expired events from EventNotificationTable.
Replicate the Commit Txn operation (with writes).
Add a new EVENT_COMMIT_TXN to log the metadata/data of all tables/partitions 
modified within the txn.

Source warehouse:

This event should read the EVENT_WRITEs from the "MTxnWriteNotificationLog" 
metastore table to consolidate the list of tables/partitions modified within 
this txn's scope.
Based on the list of tables/partitions modified and the table write ID, 
compute the list of delta files added by this txn.
Repl dump should read this message and dump the metadata and the list of 
delta files.

Target warehouse:

Ensure snapshot isolation at the target for ongoing read txns, which 
shouldn't see data replicated from a committed txn (ensured with the open 
and allocate-write-ID events).
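
The txn-to-writes bookkeeping described above might be sketched roughly as follows (Python; the names are hypothetical stand-ins for the MTxnWriteNotificationLog table, its cleaner thread, and the commit-time consolidation):

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the MTxnWriteNotificationLog table:
# txn ID -> list of (table, partition) entries written within that txn.
write_log = defaultdict(list)

def log_write_event(txn_id, table, partition=None):
    # One entry per partition when a statement touches multiple partitions.
    write_log[txn_id].append((table, partition))

def consolidate_for_commit(txn_id):
    # At EVENT_COMMIT_TXN time, consolidate the distinct tables/partitions
    # modified within this txn's scope (what repl dump would read).
    return sorted(set(write_log[txn_id]))

def expire_txn(txn_id):
    # The cleaner thread removes the entries once the event has expired.
    write_log.pop(txn_id, None)
```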

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maheshk114/hive BUG-92690

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/335.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #335


commit 4cf06e4cd921028cd02d71ba022385986e0227ee
Author: Mahesh Kumar Behera 
Date:   2018-04-18T06:07:38Z

HIVE-19267 : Create/Replicate ACID Write event




---


[jira] [Created] (HIVE-19269) Vectorization: Turn On by Default

2018-04-22 Thread Matt McCline (JIRA)
Matt McCline created HIVE-19269:
---

 Summary: Vectorization: Turn On by Default
 Key: HIVE-19269
 URL: https://issues.apache.org/jira/browse/HIVE-19269
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline
 Fix For: 3.0.0, 3.1.0


Reflect that most Hive deployments are expected to use vectorization, and 
change the default of hive.vectorized.execution.enabled to true.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19268) Create and Replicate ACID write operations

2018-04-22 Thread mahesh kumar behera (JIRA)
mahesh kumar behera created HIVE-19268:
--

 Summary: Create and Replicate ACID write operations
 Key: HIVE-19268
 URL: https://issues.apache.org/jira/browse/HIVE-19268
 Project: Hive
  Issue Type: Task
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19267) Create/Replicate ACID Write event

2018-04-22 Thread mahesh kumar behera (JIRA)
mahesh kumar behera created HIVE-19267:
--

 Summary: Create/Replicate ACID Write event
 Key: HIVE-19267
 URL: https://issues.apache.org/jira/browse/HIVE-19267
 Project: Hive
  Issue Type: Sub-task
  Components: repl, Transactions
Affects Versions: 3.0.0
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera
 Fix For: 3.0.0


*EVENT_ALLOCATE_WRITE_ID*
*Source Warehouse:*
 * Create a new event type EVENT_ALLOCATE_WRITE_ID with a related message 
format, etc.

 * Capture this event when a table write ID is allocated from the sequence 
table by an ACID operation.

 * Repl dump should read this event from EventNotificationTable and dump the 
message.

*Target Warehouse:*
 * Repl load should read the event from the dump and get the message.

 * Validate that the source txn ID from the event is present in the 
source-target txn ID map. If it is not, just no-op the event.

 * If valid, allocate a table write ID from the sequence table.

*Extend the listener notify-event API to add two new parameters, dbconn and 
sqlgenerator, to add the events to the notification_log table within the 
same transaction.*
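
The target-warehouse steps above could be sketched as follows (Python pseudocode; the event shape, map, and allocator names are hypothetical illustrations, not Hive's actual API):

```python
def replay_allocate_write_id(event, txn_map, allocate_fn):
    # Target-side handling of a replicated EVENT_ALLOCATE_WRITE_ID:
    # look up the source txn in the source->target txn ID map.
    target_txn = txn_map.get(event["src_txn_id"])
    if target_txn is None:
        # Source txn was not replicated here: no-op the event.
        return None
    # If valid, allocate a table write ID for the mapped target txn.
    return allocate_fn(target_txn, event["table"])
```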



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


how to extract metadata of hive tables in speed

2018-04-22 Thread 侯宗田

Hi,

I am writing an application which needs metadata about Hive tables. I have 
used WebHCat to get information about the tables and process it. But a simple 
request takes over eight seconds to respond on localhost. Why is this so slow, 
and how can I fix it? Or is there another way I can extract the metadata in C?

$ time curl -s 
'http://localhost:50111/templeton/v1/ddl/database/default/table/haha?user.name=ctdean
 
'
{"columns": 
  [{"name":"id","type":"int"}],
  "database":"default",
  "table":"haha"}

real0m8.400s
user0m0.053s
sys 0m0.019s
It seems to run hcat.py, which creates a bunch of things and then clears 
them, and that takes a very long time. Does anyone have any ideas about it? 
Any suggestions will be much appreciated!

$hcat.py -e "use default; desc haha; "
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings 
 for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/04/21 16:38:13 INFO conf.HiveConf: Found configuration file 
file:/usr/local/hive/conf/hive-site.xml
18/04/21 16:38:15 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: 
/tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
18/04/21 16:38:16 INFO session.SessionState: Created local directory: 
/tmp/hive/java/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: 
/tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668/_tmp_space.db
18/04/21 16:38:16 INFO ql.Driver: Compiling 
command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62): 
use default
18/04/21 16:38:17 INFO metastore.HiveMetaStore: 0: Opening raw store with 
implementation class:org.apache.hadoop.hive.metastore.ObjectStore
18/04/21 16:38:17 INFO metastore.ObjectStore: ObjectStore, initialize called
18/04/21 16:38:18 INFO DataNucleus.Persistence: Property 
hive.metastore.integral.jdo.pushdown unknown - will be ignored
18/04/21 16:38:18 INFO DataNucleus.Persistence: Property 
datanucleus.cache.level2 unknown - will be ignored
18/04/21 16:38:18 INFO metastore.ObjectStore: Setting MetaStore object pin 
classes with 
hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL, 
underlying DB is MYSQL
18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added admin role in metastore
18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added public role in metastore
18/04/21 16:38:20 INFO metastore.HiveMetaStore: No user is added in admin role, 
since config is empty
18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_all_functions
18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
ip=unknown-ip-addr  cmd=get_all_functions
18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
ip=unknown-ip-addr  cmd=get_database: default
18/04/21 16:38:20 INFO ql.Driver: Semantic Analysis Completed
18/04/21 16:38:20 INFO ql.Driver: Returning Hive schema: 
Schema(fieldSchemas:null, properties:null)
18/04/21 16:38:20 INFO ql.Driver: Completed compiling 
command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62); 
Time taken: 3.936 seconds
18/04/21 16:38:20 INFO ql.Driver: Concurrency mode is disabled, not creating a 
lock manager
18/04/21 16:38:20 INFO ql.Driver: Executing 
command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62): 
use default
18/04/21 16:38:20 INFO sqlstd.SQLStdHiveAccessController: Created 
SQLStdHiveAccessController for session context : HiveAuthzSessionContext 
[sessionString=05096382-f9b6-4dae-aee2-dfa6750c0668, clientType=HIVECLI]
18/04/21 16:38:20 WARN session.SessionState: METASTORE_FILTER_HOOK will be 
ignored, since hive.security.authorization.manager is set to instance of 
HiveAuthorizerFactory.
18/04/21 16:38:20 INFO hive.metastore: Mestastore configuration 
hive.metastore.filter.hook changed from 
org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to 
org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Cleaning up thread local 
RawStore...
18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda

Re: Does Hive support Hbase-synced partitioned tables?

2018-04-22 Thread Oleksiy S
Any updates?

On Fri, Apr 20, 2018 at 10:54 AM, Oleksiy S 
wrote:

> Hi all.
>
> I can create following table
>
> create table hbase_partitioned(doc_id STRING, EmployeeID Int, FirstName
> String, Designation  String, Salary Int) PARTITIONED BY (Department String)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH
> SERDEPROPERTIES ("hbase.columns.mapping" = ":key,boolsCF:EmployeeID,
> intsCF:FirstName,intsCF:Designation,intsCF:Salary") TBLPROPERTIES("
> hbase.table.name" = "hbase_partitioned");
>
>
> But when I try to insert data, I get an exception. Is this expected
> behavior?
>
> INSERT INTO TABLE hbase_partitioned PARTITION(department='A') values
> ('1', 1, 'John Connor', 'New York', 2300),
> ('2', 2, 'Max Plank', 'Las Vegas', 1300),
> ('3', 3, 'Arni Shwarz', 'Los Angelos', 7700),
> ('4', 4, 'Sarah Connor', 'Oakland', 9700);
>
>
>
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in
> the future versions. Consider using a different execution engine (i.e.
> spark, tez) or using Hive 1.X releases.
> Query ID = mapr_20180420074356_b13d8652-1ff6-4fe1-975c-7318db6037de
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.IllegalArgumentException: Must specify table name
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(
> FileSinkOperator.java:1136)
> at org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(
> HiveOutputFormatImpl.java:67)
> at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(
> JobSubmitter.java:271)
> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(
> JobSubmitter.java:142)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGroupInformation.java:1595)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGroupInformation.java:1595)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(
> JobClient.java:570)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(
> ExecDriver.java:434)
> at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(
> MapRedTask.java:138)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
> TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2074)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1745)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1454)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1162)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(
> CliDriver.java:238)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:186)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:405)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:791)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:729)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:652)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:647)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException:
> Must specify table name
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(
> FileSinkOperator.java:1158)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(
> FileSinkOperator.java:1133)
> ... 38 more
> Caused by: java.lang.IllegalArgumentException: Must specify table name
> at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.
> setConf(TableOutputFormat.java:191)
> at org.apache.hive.common.util.ReflectionUtil.setConf(
> ReflectionUtil.java:101)
> at org.apache.hive.common.util.ReflectionUtil.newInstance(
> ReflectionUtil.java:87)
> at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(
> 

[jira] [Created] (HIVE-19266) Use UDFs in Hive-On-Spark complains Unable to find class Exception regarding kryo

2018-04-22 Thread Di Zhu (JIRA)
Di Zhu created HIVE-19266:
-

 Summary: Use UDFs in Hive-On-Spark complains Unable to find class 
Exception regarding kryo
 Key: HIVE-19266
 URL: https://issues.apache.org/jira/browse/HIVE-19266
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.3.2
Reporter: Di Zhu


For a SQL query with a UDF in Hive, as below:

 

```

set hive.execution.engine=spark;
add jar viewfs:///path_to_the_jar/aaa.jar;
create temporary function func_name AS 'com.abc.ClassName';

select func_name(col_a) from table_name limit 100;

```

it fails with the following error:

 

```

ERROR : Job failed with java.lang.ClassNotFoundException: com.abc.ClassName
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
com.abc.ClassName
Serialization trace:
genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
colList (org.apache.hadoop.hive.ql.plan.SelectDesc)
conf (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
left (org.apache.commons.lang3.tuple.ImmutablePair)
edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
 at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
 at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:181)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at 

Re: Review Request 66503: HIVE-19126: CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-22 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66503/#review201706
---




standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
Lines 52 (patched)


standalone-metastore shall not depend on ql.


- Daniel Dai


On April 16, 2018, 9:42 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66503/
> ---
> 
> (Updated April 16, 2018, 9:42 p.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-19126
> https://issues.apache.org/jira/browse/HIVE-19126
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-19126
> 
> 
> Diffs
> -
> 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
>  6f4ec6f1ea 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
>  2f7fa24558 
>   
> llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java
>  0bbaf7e459 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
>  1ce86bbdba 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
>  89b400697b 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  f007261daf 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/SizeValidator.java
>  PRE-CREATION 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  d451f966b0 
> 
> 
> Diff: https://reviews.apache.org/r/66503/diff/6/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>