[jira] [Created] (HIVE-20844) Cache Instances of CacheManager in DummyTxnManager

2018-10-31 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20844:
--

 Summary: Cache Instances of CacheManager in DummyTxnManager
 Key: HIVE-20844
 URL: https://issues.apache.org/jira/browse/HIVE-20844
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Locking
Affects Versions: 3.1.0, 2.3.2, 4.0.0
Reporter: BELUGA BEHR


I noticed that the {{DummyTxnManager}} class instantiates quite a few instances 
of {{ZooKeeperHiveLockManager}}. The ZooKeeper lock manager opens a connection 
to ZooKeeper for each instance created.  It also performs initialization steps 
that are almost always pure noise and extra pressure on ZooKeeper, because the 
setup has already been done and the steps are therefore no-ops.  
{{ZooKeeperHiveLockManager}} should be a singleton with one long-lived 
connection to the ZooKeeper service. Perhaps the {{HiveLockManager}} interface 
could have an {{isSingleton()}} method indicating that the lock manager should 
only be instantiated once and cached for subsequent sessions (a rough sketch of 
the idea follows the snippets below).

 
{code:java}
2018-05-14 22:45:30,574  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1252389]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
2018-05-14 22:51:27,865  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1252671]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
2018-05-14 22:51:37,552  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1252686]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
2018-05-14 22:51:49,046  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1252736]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
2018-05-14 22:51:50,664  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1252742]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
2018-05-14 23:00:54,314  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1253479]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
2018-05-14 23:17:26,867  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1254180]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
2018-05-14 23:24:25,426  INFO  
org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager: 
[HiveServer2-Background-Pool: Thread-1255493]: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
{code}
{code:java|title=DummyTxnManager.java}
  @Override
  public HiveLockManager getLockManager() throws LockException {
    if (lockMgr == null) {
      boolean supportConcurrency =
          conf.getBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY);
      if (supportConcurrency) {
        String lockMgrName =
            conf.getVar(HiveConf.ConfVars.HIVE_LOCK_MANAGER);
        if ((lockMgrName == null) || (lockMgrName.isEmpty())) {
          throw new LockException(ErrorMsg.LOCKMGR_NOT_SPECIFIED.getMsg());
        }

        try {
          // CACHE LM HERE
          LOG.info("Creating lock manager of type " + lockMgrName);
          lockMgr = (HiveLockManager) ReflectionUtils.newInstance(
              conf.getClassByName(lockMgrName), conf);
          lockManagerCtx = new HiveLockManagerCtx(conf);
          lockMgr.setContext(lockManagerCtx);
        } catch (Exception e) {
          ...
{code}
[https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockManager.java]

{code:java|title=ZooKeeperHiveLockManager Initialization}
try {
  curatorFramework = CuratorFrameworkSingleton.getInstance(conf);
  parent = conf.getVar(HiveConf.ConfVars.HIVE_ZOOKEEPER_NAMESPACE);
  try {
    curatorFramework.create().withMode(CreateMode.PERSISTENT).forPath("/" + parent, new byte[0]);
  } catch (Exception e) {
    // ignore if the parent already exists
    if (!(e instanceof KeeperException) || ((KeeperException) e).code() !=
        KeeperException.Code.NODEEXISTS) {
      LOG.warn("Unexpected ZK exception when creating parent node /" + parent, e);
    }
  }
{code}
 
https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java#L96-L106
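
For illustration only, here is a minimal stand-alone sketch of the shape the proposal could take. The {{LockManager}}/{{LockManagerFactory}} types and the {{isSingleton()}} default method below are hypothetical stand-ins, not existing Hive APIs:

{code:java|title=Sketch: reusing singleton lock managers}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical shape of the proposal: implementations that declare themselves
// singletons are created once per process and reused by later sessions.
interface LockManager {
  /** Hypothetical flag: true if one instance may safely be shared across sessions. */
  default boolean isSingleton() {
    return false;
  }
}

final class LockManagerFactory {
  private static final Map<String, LockManager> SINGLETONS = new ConcurrentHashMap<>();

  static LockManager get(String className, Supplier<LockManager> creator) {
    LockManager cached = SINGLETONS.get(className);
    if (cached != null) {
      return cached;                     // reuse the long-lived instance (and its ZK connection)
    }
    LockManager created = creator.get(); // expensive: may open a new ZooKeeper connection
    if (created.isSingleton()) {
      LockManager raced = SINGLETONS.putIfAbsent(className, created);
      return raced != null ? raced : created;
    }
    return created;                      // per-session instance, exactly as today
  }
}
{code}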





[jira] [Created] (HIVE-20845) Fix TestJdbcWithDBTokenStoreNoDoAs flakiness

2018-10-31 Thread Peter Vary (JIRA)
Peter Vary created HIVE-20845:
-

 Summary: Fix TestJdbcWithDBTokenStoreNoDoAs flakiness
 Key: HIVE-20845
 URL: https://issues.apache.org/jira/browse/HIVE-20845
 Project: Hive
  Issue Type: Test
Reporter: Peter Vary
Assignee: Peter Vary


We previously applied a dirty fix for TestJdbcWithDBTokenStoreNoDoAs and 
TestJdbcWithDBTokenStore.
It turned out that the real issue is that we do not wait long enough for HS2 to come up.
This needs to be fixed in MiniHS2.waitForStartup().
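
A rough, illustrative sketch of the kind of bounded wait implied here (the helper name, the readiness probe, and the timeout handling are placeholders, not the actual MiniHS2 code):

{code:java}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

final class StartupWaiter {
  /** Polls the readiness probe until it reports success or the timeout elapses. */
  static void waitForStartup(BooleanSupplier isUp, long timeout, TimeUnit unit)
      throws InterruptedException, TimeoutException {
    long deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
    while (!isUp.getAsBoolean()) {
      if (System.nanoTime() - deadlineNanos > 0) {
        throw new TimeoutException("HS2 did not come up within " + timeout + " " + unit);
      }
      Thread.sleep(1000L); // back off before probing again
    }
  }
}
{code}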





[jira] [Created] (HIVE-20847) Review of NullScanCode

2018-10-31 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20847:
--

 Summary: Review of NullScanCode
 Key: HIVE-20847
 URL: https://issues.apache.org/jira/browse/HIVE-20847
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Affects Versions: 3.1.0, 4.0.0
Reporter: BELUGA BEHR


What got me looking at this class was the verbosity of some of the logging.  
I would like to request that we move this logging to DEBUG level, since this 
amount of detail means nothing to a cluster admin.

Also, this {{contains}} call would be better applied to a {{HashSet}} 
instead of an {{ArrayList}}.

{code:java|title=NullScanTaskDispatcher.java}
  private void processAlias(MapWork work, Path path,
      ArrayList<String> aliasesAffected, ArrayList<String> aliases) {
    // the aliases that are allowed to map to a null scan.
    ArrayList<String> allowed = new ArrayList<String>();
    for (String alias : aliasesAffected) {
      if (aliases.contains(alias)) {
        allowed.add(alias);
      }
    }
{code}
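
For illustration, a minimal sketch of the suggested change (the helper class below is hypothetical; the HashSet lookup is the point):

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class AliasFilter {
  /** Copies the aliases into a HashSet once so each membership check is O(1). */
  static List<String> allowedAliases(List<String> aliasesAffected, List<String> aliases) {
    Set<String> aliasSet = new HashSet<>(aliases);
    List<String> allowed = new ArrayList<>();
    for (String alias : aliasesAffected) {
      if (aliasSet.contains(alias)) {  // O(1) instead of scanning the ArrayList
        allowed.add(alias);
      }
    }
    return allowed;
  }
}
{code}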






Re: [VOTE] Apache Hive 3.1.1 Release Candidate 0

2018-10-31 Thread Zoltan Haindrich
I've done the following and not seen any problems:

* verified asc/sha
* built from sources
* installed it on a vagrant box with hadoop-3.1.1/tez-0.9.1; ran some queries 
without issues

+1

On 30 October 2018 16:46:07 GMT-07:00, Thejas Nair  
wrote:
> * Verified signatures and checksums.
> * Reviewed git rc tag contents
> * Build src tar.gz
> * Untarred bin.tar.gz and ran queries
>
>+1 to the release
>
>
>On Wed, Oct 24, 2018 at 4:59 PM Daniel Dai 
>wrote:
>>
>> Apache Hive 3.1.1 Release Candidate 0 is available here:
>>
>> http://people.apache.org/~daijy/apache-hive-3.1.1-rc-0/
>>
>> Maven artifacts are available here:
>>
>> https://repository.apache.org/content/repositories/orgapachehive-1092
>>
>> Source tag for RCN is at:
>>
>> https://github.com/apache/hive/tree/release-3.1.1-rc0
>>
>> Voting will conclude in 72 hours.
>>
>> Hive PMC Members: Please test and vote.
>>
>> Thanks.
>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [ANNOUNCE] New PMC Member : Zoltan

2018-10-31 Thread Vihang Karajgaonkar
Congrats Zoltan!

On Wed, Oct 31, 2018 at 1:10 AM, Peter Vary 
wrote:

> Congratulations Zoltan!
> Well deserved!
>
> > On Oct 31, 2018, at 06:16, Prasanth Jayachandran <
> pjayachand...@hortonworks.com> wrote:
> >
> > Congratulations!
> >
> >> On Oct 30, 2018, at 10:15 PM, Deepak Jaiswal 
> wrote:
> >>
> >> Congratulations Zoltan!
> >>
> >> On 10/30/18, 10:08 PM, "Ashutosh Chauhan" 
> wrote:
> >>
> >>Hello Hive community,
> >>
> >>   I'm pleased to announce that Zoltan Haindrich has accepted the Apache
> >>   Hive PMC's invitation, and is our newest PMC member. Many thanks to
> >>   Zoltan for all of his hard work.
> >>
> >>   Please join me in congratulating Zoltan!
> >>
> >>   Thanks,
> >>   Ashutosh
> >>
> >>
> >
>
>


[RESULT][VOTE] Apache Hive 3.1.1 Release Candidate 0

2018-10-31 Thread Daniel Dai
[RESULT][VOTE] Apache Hive 3.1.1 Release Candidate 0

With three +1’s (Thejas, Zoltan, Daniel) and no -1’s the vote passes.

I will wrap up the release and send announcement once done.


Re: Ptests not working

2018-10-31 Thread Vihang Karajgaonkar
Created https://issues.apache.org/jira/browse/INFRA-17193

On Wed, Oct 31, 2018 at 1:30 AM, Peter Vary 
wrote:

> Likely not ptest issue, but restarted ptest anyway.
>
> It says:
> (pending—Waiting for next available executor on Hadoop <
> https://builds.apache.org/label/Hadoop>)
>
> So maybe Apache infra issue?
>
> > On Oct 31, 2018, at 01:20, Jesus Camacho Rodriguez <jcamachorodrig...@hortonworks.com> wrote:
> >
> > It seems we are not getting executors… Is there any known (infra) issue?
> >
> > Thanks,
> > Jesús
>
>


Re: Review Request 69107: HIVE-20512

2018-10-31 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69107/#review210230
---




ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMergeFileRecordHandler.java
Line 93 (original), 93 (patched)


I'm not sure, but shouldn't we call incrementRowNumber from here as well?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java
Lines 67 (patched)


I think it should be a daemon. What do you think?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java
Line 62 (original), 83 (patched)


These three lines can probably move to the MemoryInfoLogger class (as a 
static method), since from now on it is that class's responsibility.

Something like MemoryInfoLogger.start() or schedule().



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
Line 574 (original), 572 (patched)


Is this correct? In batch processing, should we be incrementing it by only one?


- Antal Sinkovits


On Oct. 26, 2018, 5:13 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69107/
> ---
> 
> (Updated Oct. 26, 2018, 5:13 p.m.)
> 
> 
> Review request for hive, Antal Sinkovits, Sahil Takiar, and Vihang 
> Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve record and memory usage logging in SparkRecordHandler
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java 
> 88dd12c05ade417aca4cdaece4448d31d4e1d65f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMergeFileRecordHandler.java
>  8880bb604e088755dcfb0bcb39689702fab0cb77 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java 
> cb5bd7ada2d5ad4f1f654cf80ddaf4504be5d035 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
>  20e7ea0f4e8d4ff79dddeaab0406fc7350d22bd7 
> 
> 
> Diff: https://reviews.apache.org/r/69107/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



[jira] [Created] (HIVE-20846) beeline does not honor driver mandated key/value pairs for uid & pwd.

2018-10-31 Thread Venu Yanamandra (JIRA)
Venu Yanamandra created HIVE-20846:
--

 Summary: beeline does not honor driver mandated key/value pairs 
for uid & pwd.
 Key: HIVE-20846
 URL: https://issues.apache.org/jira/browse/HIVE-20846
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 1.1.0
Reporter: Venu Yanamandra


Please confirm whether beeline is expected to honor the driver's key/value 
pairs for UID/PWD, and correct it if necessary.


Steps to reproduce the issue:
1. Download the latest Impala JDBC driver from 
'https://www.cloudera.com/downloads/connectors/impala/jdbc/2-6-4.html'
2. Extract it, then set and export HADOOP_CLASSPATH to: $(hadoop 
classpath):
3. As per the driver documentation, we should set 
AuthMech=3;UID=;PWD= to connect to Impala JDBC using LDAP. [1]
4. However, beeline is unable to connect successfully.
5. The Impala coordinator logs display the error [2] below.
6. If, however, we use "user=;password=" in the 
connection string, it works fine. [3]
7. The expectation, however, is that beeline should honor the driver's 
key/value pairs for UID/PWD.


Regards,
Venu Yanamandra
{noformat}
[1]:
beeline -d "com.cloudera.impala.jdbc41.Driver" -u 
"jdbc:impala://nightly512-3.vpc.cloudera.com:21051/default;SSL=1;SSLTrustStore=/etc/cdep-ssl-conf/CA_STANDARD/truststore.jks;SSLTrustStorePwd=cloudera;AllowSelfSignedCerts=1;CAIssuedCertNamesMismatch=1;AuthMech=3;UID=test1;PWD=Password1;LogLevel=6;LogPath=/root/ijdbc/drvlog"

[2]:
E1024 04:34:52.694922 29833 authentication.cc:159] SASL message (LDAP): 
All-whitespace username.
I1024 04:34:52.696190 29833 thrift-util.cc:123] TThreadPoolServer: Caught 
TException: SASL(-1): generic failure: All-whitespace username.

[3]:
beeline -d "com.cloudera.impala.jdbc41.Driver" -u 
"jdbc:impala://nightly512-3.vpc.cloudera.com:21051/default;SSL=1;SSLTrustStore=/etc/cdep-ssl-conf/CA_STANDARD/truststore.jks;SSLTrustStorePwd=cloudera;AllowSelfSignedCerts=1;CAIssuedCertNamesMismatch=1;AuthMech=3;user=test1;password=Password1;UID=test1;PWD=Password1;LogLevel=6;LogPath=/root/ijdbc/drvlog"
 -e "show databases;"
{noformat}





[jira] [Created] (HIVE-20848) After setting UpdateInputAccessTimeHook query fail with Table Not Found.

2018-10-31 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20848:
-

 Summary: After setting UpdateInputAccessTimeHook query fail with 
Table Not Found.
 Key: HIVE-20848
 URL: https://issues.apache.org/jira/browse/HIVE-20848
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


{code}
 select from_unixtime(1540495168);
 set hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec;
 select from_unixtime(1540495168);
{code}
The second select fails with the following exception:
{code}
ERROR ql.Driver: FAILED: Hive Internal Error: 
org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found 
_dummy_table)
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
_dummy_table
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155)
at 
org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
at 
org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at 
org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{code}
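
One possible direction, sketched for illustration only (this is not a committed fix; the helper below is hypothetical and the table name is taken from the stack trace above): the hook could skip Hive's internal dummy input before trying to resolve it in the metastore.

{code:java}
final class DummyInputFilter {
  // Name taken from the stack trace above; Hive uses this internal table for
  // queries that have no real table input, and it does not exist in the metastore.
  private static final String DUMMY_TABLE_NAME = "_dummy_table";

  /** Returns true only for inputs that correspond to a real metastore table. */
  static boolean isRealTable(String tableName) {
    return tableName != null && !DUMMY_TABLE_NAME.equalsIgnoreCase(tableName);
  }
}
{code}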






[jira] [Created] (HIVE-20849) Review of ConstantPropagateProcFactory

2018-10-31 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created HIVE-20849:
--

 Summary: Review of ConstantPropagateProcFactory
 Key: HIVE-20849
 URL: https://issues.apache.org/jira/browse/HIVE-20849
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 3.1.0, 4.0.0
Reporter: BELUGA BEHR
 Attachments: HIVE-20849.1.patch

I was looking at this class because it blasts a lot of information to the logs 
that is useless to an admin.  Especially when a table has a lot of columns, I see 
big blocks of logging that are meaningless to me.  I request that the logging 
be toned down to DEBUG, along with some other improvements to the code.
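
For illustration, a minimal sketch of the kind of logging change requested (the class and message below are placeholders): the verbose per-column output moves to DEBUG, and any expensive message construction is guarded.

{code:java}
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class QuietLoggingExample {
  private static final Logger LOG = LoggerFactory.getLogger(QuietLoggingExample.class);

  static void logFoldedColumns(List<String> columnNames) {
    // DEBUG instead of INFO, and guarded so the message is only built when needed.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Folding expressions for columns: {}", columnNames);
    }
  }
}
{code}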





[VOTE] Apache Hive 2.3.4 Release Candidate 0

2018-10-31 Thread Daniel Dai
Apache Hive 2.3.4 Release Candidate 0 is available here:

http://people.apache.org/~daijy/apache-hive-2.3.4-rc-0/

Maven artifacts are available here:

https://repository.apache.org/content/repositories/orgapachehive-1093

Source tag for RCN is at:

https://github.com/apache/hive/tree/release-2.3.4-rc0

Voting will conclude in 72 hours.

Hive PMC Members: Please test and vote.

Thanks.




[jira] [Created] (HIVE-20850) Add rule to extract case conditional from projections

2018-10-31 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-20850:
---

 Summary: Add rule to extract case conditional from projections
 Key: HIVE-20850
 URL: https://issues.apache.org/jira/browse/HIVE-20850
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Noticed by [~gopalv]: if there is a projection that can only be evaluated after 
the join, but the conditional references only a single column from a small 
dimension table, Hive will end up evaluating the same expression over and over 
again...

{code}
explain
select s_store_name, s_store_id,
       sum(case when (d_day_name='Sunday')    then ss_sales_price else null end) sun_sales,
       sum(case when (d_day_name='Monday')    then ss_sales_price else null end) mon_sales,
       sum(case when (d_day_name='Tuesday')   then ss_sales_price else null end) tue_sales,
       sum(case when (d_day_name='Wednesday') then ss_sales_price else null end) wed_sales,
       sum(case when (d_day_name='Thursday')  then ss_sales_price else null end) thu_sales,
       sum(case when (d_day_name='Friday')    then ss_sales_price else null end) fri_sales,
       sum(case when (d_day_name='Saturday')  then ss_sales_price else null end) sat_sales
from date_dim, store_sales, store
where d_date_sk = ss_sold_date_sk
  and s_store_sk = ss_store_sk
  and s_gmt_offset = -6
  and d_year = 1998
group by s_store_name, s_store_id
order by s_store_name, s_store_id,
         sun_sales, mon_sales, tue_sales, wed_sales, thu_sales, fri_sales, sat_sales
limit 100;
{code}





Re: Ptests not working

2018-10-31 Thread Vihang Karajgaonkar
Okay, it looks like there was a Jenkins issue which was fixed yesterday, but by
that time the queue had become very large. Also, many of the executors are
down due to a disk issue (INFRA-17188)


On Wed, Oct 31, 2018 at 10:49 AM, Vihang Karajgaonkar 
wrote:

> Created https://issues.apache.org/jira/browse/INFRA-17193
>
> On Wed, Oct 31, 2018 at 1:30 AM, Peter Vary 
> wrote:
>
>> Likely not ptest issue, but restarted ptest anyway.
>>
>> It says:
>> (pending—Waiting for next available executor on Hadoop <
>> https://builds.apache.org/label/Hadoop>)
>>
>> So maybe Apache infra issue?
>>
>> > On Oct 31, 2018, at 01:20, Jesus Camacho Rodriguez <
>> jcamachorodrig...@hortonworks.com> wrote:
>> >
>> > It seems we are not getting executors… Is there any known (infra) issue?
>> >
>> > Thanks,
>> > Jesús
>>
>>
>


Re: Review Request 69107: HIVE-20512

2018-10-31 Thread Bharathkrishna Guruvayoor Murali via Review Board


> On Oct. 31, 2018, 5:29 p.m., Antal Sinkovits wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java
> > Line 62 (original), 83 (patched)
> > 
> >
> > These three lines can probably move to the MemoryInfoLogger class (as a 
> > static method), since from now on it is that class's responsibility.
> > 
> > Something like MemoryInfoLogger.start() or schedule().

It is used by the close method as well, so I will probably leave it there.


> On Oct. 31, 2018, 5:29 p.m., Antal Sinkovits wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
> > Line 574 (original), 572 (patched)
> > 
> >
> > Is this correct? In batch processing, should we be incrementing it by only one?

Looking at the implementations of reducer.process(batch, 0), the batch is 
treated as a single row, and there too runTimeRows is incremented by one.
That is probably why the row count is incremented by 1 here as well.


- Bharathkrishna


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69107/#review210230
---


On Oct. 26, 2018, 5:13 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69107/
> ---
> 
> (Updated Oct. 26, 2018, 5:13 p.m.)
> 
> 
> Review request for hive, Antal Sinkovits, Sahil Takiar, and Vihang 
> Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve record and memory usage logging in SparkRecordHandler
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java 
> 88dd12c05ade417aca4cdaece4448d31d4e1d65f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMergeFileRecordHandler.java
>  8880bb604e088755dcfb0bcb39689702fab0cb77 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java 
> cb5bd7ada2d5ad4f1f654cf80ddaf4504be5d035 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
>  20e7ea0f4e8d4ff79dddeaab0406fc7350d22bd7 
> 
> 
> Diff: https://reviews.apache.org/r/69107/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



Re: Review Request 69107: HIVE-20512

2018-10-31 Thread Bharathkrishna Guruvayoor Murali via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69107/
---

(Updated Oct. 31, 2018, 11:15 p.m.)


Review request for hive, Antal Sinkovits, Sahil Takiar, and Vihang Karajgaonkar.


Changes
---

Added a change so the thread factory creates daemon threads.
Addressed review comments. I am using shutdownNow instead of shutdown because 
we do not really need awaitTermination: shutdownNow interrupts the existing 
threads anyway and does not accept new tasks.
With awaitTermination, the main thread has to wait for a fixed amount of time, 
which I feel is not really useful here. We are not executing any critical 
task, and we do not care about the logger thread because we log the final count 
and memory usage in the close method anyway.
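
For illustration, a small stand-alone sketch of that shutdown pattern (a daemon
ThreadFactory plus shutdownNow, no awaitTermination); the class and names below
are placeholders, not the actual patch:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    final class MemoryInfoLoggerSketch {
      // Daemon threads so the periodic logger can never keep the JVM alive.
      private final ScheduledExecutorService executor =
          Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread t = new Thread(runnable, "memory-info-logger");
            t.setDaemon(true);
            return t;
          });

      void start() {
        executor.scheduleAtFixedRate(() -> {
          Runtime rt = Runtime.getRuntime();
          System.out.println("used memory: " + (rt.totalMemory() - rt.freeMemory()));
        }, 0L, 30L, TimeUnit.SECONDS);
      }

      void close() {
        // shutdownNow interrupts the periodic task and rejects new ones; no
        // awaitTermination, since nothing critical runs here and close() logs
        // the final numbers anyway.
        executor.shutdownNow();
      }
    }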


Repository: hive-git


Description
---

Improve record and memory usage logging in SparkRecordHandler


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java 
88dd12c05ade417aca4cdaece4448d31d4e1d65f 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMergeFileRecordHandler.java
 8880bb604e088755dcfb0bcb39689702fab0cb77 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java 
cb5bd7ada2d5ad4f1f654cf80ddaf4504be5d035 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java 
20e7ea0f4e8d4ff79dddeaab0406fc7350d22bd7 


Diff: https://reviews.apache.org/r/69107/diff/4/

Changes: https://reviews.apache.org/r/69107/diff/3-4/


Testing
---


Thanks,

Bharathkrishna Guruvayoor Murali



[jira] [Created] (HIVE-20851) Hive on Spark can't support Hive ACID transaction table updates.

2018-10-31 Thread Tang Yan (JIRA)
Tang Yan created HIVE-20851:
---

 Summary: Hive on Spark can't support Hive ACID transaction table 
updates.
 Key: HIVE-20851
 URL: https://issues.apache.org/jira/browse/HIVE-20851
 Project: Hive
  Issue Type: Bug
  Components: Hive, Spark
Affects Versions: 1.2.1
Reporter: Tang Yan


I've enabled all the configuration needed to turn on Hive transactional tables, 
and inserts and updates succeed via beeline with the MR engine.

When I set hive.execution.engine=spark, the same update SQL does not work.

There is no error in the Hive on Spark job, but the delta file is always named 
delta_000_000.





Re: [ANNOUNCE] New PMC Member : Zoltan

2018-10-31 Thread Sankar Hariappan
Congrats Zoltan!

Best regards
Sankar







On 31/10/18, 11:12 PM, "Vihang Karajgaonkar"  
wrote:

>Congrats Zoltan!
>
>On Wed, Oct 31, 2018 at 1:10 AM, Peter Vary 
>wrote:
>
>> Congratulations Zoltan!
>> Well deserved!
>>
>> > On Oct 31, 2018, at 06:16, Prasanth Jayachandran <
>> pjayachand...@hortonworks.com> wrote:
>> >
>> > Congratulations!
>> >
>> >> On Oct 30, 2018, at 10:15 PM, Deepak Jaiswal 
>> wrote:
>> >>
>> >> Congratulations Zoltan!
>> >>
>> >> On 10/30/18, 10:08 PM, "Ashutosh Chauhan" 
>> wrote:
>> >>
>> >>Hello Hive community,
>> >>
>> >>   I'm pleased to announce that Zoltan Haindrich has accepted the Apache
>> >>   Hive PMC's invitation, and is our newest PMC member. Many thanks to
>> >>   Zoltan for all of his hard work.
>> >>
>> >>   Please join me in congratulating Zoltan!
>> >>
>> >>   Thanks,
>> >>   Ashutosh
>> >>
>> >>
>> >
>>
>>


Re: [ANNOUNCE] New PMC Member : Zoltan

2018-10-31 Thread Peter Vary
Congratulations Zoltan!
Well deserved!

> On Oct 31, 2018, at 06:16, Prasanth Jayachandran 
>  wrote:
> 
> Congratulations!
> 
>> On Oct 30, 2018, at 10:15 PM, Deepak Jaiswal  
>> wrote:
>> 
>> Congratulations Zoltan!
>> 
>> On 10/30/18, 10:08 PM, "Ashutosh Chauhan"  wrote:
>> 
>>Hello Hive community,
>> 
>>   I'm pleased to announce that Zoltan Haindrich has accepted the Apache
>>   Hive PMC's invitation, and is our newest PMC member. Many thanks to
>>   Zoltan for all of his hard work.
>> 
>>   Please join me in congratulating Zoltan!
>> 
>>   Thanks,
>>   Ashutosh
>> 
>> 
> 



Re: Ptests not working

2018-10-31 Thread Peter Vary
Likely not ptest issue, but restarted ptest anyway.

It says:
(pending—Waiting for next available executor on Hadoop 
<https://builds.apache.org/label/Hadoop>)

So maybe Apache infra issue?

> On Oct 31, 2018, at 01:20, Jesus Camacho Rodriguez 
>  wrote:
> 
> It seems we are not getting executors… Is there any known (infra) issue?
> 
> Thanks,
> Jesús