[jira] [Created] (HIVE-20970) ORC table with bloom filter fails on PPD query

2018-11-26 Thread Gabriel C Balan (JIRA)
Gabriel C Balan created HIVE-20970:
--

 Summary: ORC table with bloom filter fails on PPD query
 Key: HIVE-20970
 URL: https://issues.apache.org/jira/browse/HIVE-20970
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Hive, ORC
Affects Versions: 2.1.0
Reporter: Gabriel C Balan


I encountered this issue in hive2.1.0-cdh6.0.0.
{noformat:title=Reproducer}
drop table if exists t1;

create table t1(c1 string, c2 int) stored as orc
TBLPROPERTIES ("orc.compress"="NONE", 
   "orc.bloom.filter.columns"="c2");


INSERT INTO TABLE t1 VALUES ("row 1", 1), ("row 2", 2), ("row 3", 3);

--this works fine
set hive.optimize.index.filter=false;
select * from t1 where c2=2;

--this fails
set hive.optimize.index.filter=true;
select * from t1 where c2=2;
{noformat}
These three items are essential to reproducing the issue:
 # hive.optimize.index.filter=true;
 # "orc.compress"="NONE" in TBLPROPERTIES
 # "orc.bloom.filter.columns"="c2" in TBLPROPERTIES

That is, if any of the above-mentioned items is taken out, the query no longer fails.

Finally, here is the stack: 
{noformat:title=Stack trace in log4j file}
java.io.IOException: java.lang.IllegalStateException: InputStream#read(byte[]) returned invalid result: 0
The InputStream implementation is buggy.
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2188)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:774)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:701)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
Caused by: java.lang.IllegalStateException: InputStream#read(byte[]) returned invalid result: 0
The InputStream implementation is buggy.
        at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:739)
        at com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
        at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
        at org.apache.orc.OrcProto$RowIndex.<init>(OrcProto.java:7429)
        at org.apache.orc.OrcProto$RowIndex.<init>(OrcProto.java:7393)
        at org.apache.orc.OrcProto$RowIndex$1.parsePartialFrom(OrcProto.java:7482)
        at org.apache.orc.OrcProto$RowIndex$1.parsePartialFrom(OrcProto.java:7477)
        at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
        at org.apache.orc.OrcProto$RowIndex.parseFrom(OrcProto.java:7593)
        at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readRowIndex(RecordReaderUtils.java:138)
        at org.apache.orc.impl.RecordReaderImpl.readRowIndex(RecordReaderImpl.java:1151)
        at org.apache.orc.impl.RecordReaderImpl.readRowIndex(RecordReaderImpl.java:1134)
        at org.apache.orc.impl.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:800)
        at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:830)
        at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
        at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1021)
        at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:215)
        at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:63)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:87)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:314)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:225)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1691)
at 

Re: When do the deltas of a transaction become observable?

2018-11-26 Thread Gopal Vijayaraghavan
 
>release of the locks) but I can't seem to find it. As it's a transactional
>system I'd expect we observe both deltas or none at all, at the point of
>successful commit.

In Hive's internals, "observe" is slightly different from "use". The Hive ACID system 
can see a file on HDFS and then ignore it, because it is from the "future". 

You can sort of start from this line 

https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hadoop/hive/common/ValidReaderWriteIdList.java#L70

and work backwards.

>I had done some basic tests to determine if the observation semantics were
>tied to the metadata in the database product for the transactional system
>but I could only determine write IDs were influencing this, e.g. if write
>ID = 7 for a given table, then the read would consist of all deltas with a
>write ID < 7.

Yes, you're on the right track. There's a mapping from txn_id -> write_id 
(per-table), maintained by the writers (i.e. if a txn commits, then the write_id 
is visible).

For each table, in each query, there's a snapshot taken which has a min:max and 
list of exceptions.

When a query starts, it sees that all txns up to 5 are committed or cleaned, 
therefore everything <= 5 is good.

It knows that the highest known txn is 10, so everything > 10 is to be ignored.

And between 5 & 10, it knows that 7 is aborted and 8 is still open (i.e. the 
exceptions).

So if it sees a delta_11 dir, it ignores it. If it sees a delta_8, it ignores 
it too.
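
To put the rules above in code form, here is a minimal, self-contained sketch of 
that visibility check; the class and method names are made up for illustration, 
the real logic lives in ValidReaderWriteIdList:

{code}
// Illustrative sketch of the snapshot rule described above
// (everything <= 5 committed, highest known txn 10, exceptions {7, 8}).
// The names here are made up; this is not the actual ValidReaderWriteIdList API.
import java.util.Set;

class SnapshotSketch {
    private final long lowWatermark;    // everything <= this is committed or cleaned
    private final long highWatermark;   // everything > this is from the "future"
    private final Set<Long> exceptions; // open or aborted ids between the two

    SnapshotSketch(long lowWatermark, long highWatermark, Set<Long> exceptions) {
        this.lowWatermark = lowWatermark;
        this.highWatermark = highWatermark;
        this.exceptions = exceptions;
    }

    boolean isVisible(long id) {
        if (id <= lowWatermark) return true;   // committed or already cleaned
        if (id > highWatermark) return false;  // future delta, ignore it
        return !exceptions.contains(id);       // open/aborted are ignored too
    }

    public static void main(String[] args) {
        SnapshotSketch snap = new SnapshotSketch(5, 10, Set.of(7L, 8L));
        System.out.println(snap.isVisible(11)); // false -> delta_11 is ignored
        System.out.println(snap.isVisible(8));  // false -> delta_8 is ignored
        System.out.println(snap.isVisible(9));  // true  -> visible within the snapshot
    }
}
{code}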

The "ACID" implementation hides future updates in plain sight and doesn't need 
HDFS to be able to rename multiple dirs together.

Most of those smarts are in the split generation, not in the commit (however, the 
commit does something else to detect write conflicts, which is its own thing).

>If someone could point me in the right direction, or correct my
>understanding then I would greatly appreciate it.

This implementation is built with the txn -> write_id indirection to support 
cross-replication between, say, an east-coast cluster and a west-coast cluster, 
each owning primary data-sets on its own coast.

Cheers,
Gopal 




[jira] [Created] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)
Peter Vary created HIVE-20969:
-

 Summary: HoS sessionId generation can cause race conditions when 
uploading files to HDFS
 Key: HIVE-20969
 URL: https://issues.apache.org/jira/browse/HIVE-20969
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 4.0.0
Reporter: Peter Vary
Assignee: Peter Vary


The observed exception is:
{code}
Caused by: java.io.FileNotFoundException: File does not exist: /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
{code}





Re: Hive ACID files compacted directory rename on cloud blob stores.

2018-11-26 Thread Gopal Vijayaraghavan


>Oh but s3Guard will not solve the atomicity problem, right?

S3Guard does solve the atomicity problem, because compactors don't just rename 
directories.

The basic consistency needed for ACID is list-after-delete and list-after-create 
(which S3 does not have).

They also place a file named '_orc_acid_version' in the directory.

This happens after rename() returns.

fs.rename(fileStatus.getPath(), newPath);
AcidUtils.OrcAcidVersion.writeVersionFile(newPath, fs);

With S3Guard, all that is needed is to check for that file (and if it is missing, 
it is not a complete compacted dir yet).
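
As a rough sketch of that check against the Hadoop FileSystem API (the helper 
method is made up for illustration; only the '_orc_acid_version' file name comes 
from above):

{code}
// Sketch only: look for the compactor's marker file before treating a
// compacted directory as complete. isCompactionComplete is a hypothetical helper.
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

class CompactionMarkerCheck {
    private static final String ORC_ACID_VERSION_FILE = "_orc_acid_version";

    // The compactor writes the marker after rename() returns, so its presence
    // means the directory contents are fully in place.
    static boolean isCompactionComplete(FileSystem fs, Path compactedDir) throws IOException {
        return fs.exists(new Path(compactedDir, ORC_ACID_VERSION_FILE));
    }
}
{code}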

However, the "open a txn for compact & commit it" is definitely neater.

> So that means that the directory will be "visible while in progress", and
>  the reader might pick up the compacted directory even when all files
> haven't been copied.

In another thread today, I mentioned how ACID is built on top of ignoring 
directories; it can do that easily.

The Parquet or Avro transactional system in Hive boils down to a PathFilter 
with some numbers in the path.
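
For illustration, a simplified sketch of what such a PathFilter could look like; 
the directory-name pattern and the class are assumptions, not Hive's actual filter:

{code}
// Toy PathFilter: accept a delta directory only if its write id is inside the
// snapshot. Real delta dirs look like delta_<minWriteId>_<maxWriteId>[_<stmtId>];
// this sketch only pulls out the first write id.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DeltaVisibilityFilter implements PathFilter {
    private static final Pattern DELTA_DIR = Pattern.compile("delta_0*(\\d+).*");

    private final long highWatermark;      // highest write id visible to the query
    private final Set<Long> openOrAborted; // write ids that must be skipped

    DeltaVisibilityFilter(long highWatermark, Set<Long> openOrAborted) {
        this.highWatermark = highWatermark;
        this.openOrAborted = openOrAborted;
    }

    @Override
    public boolean accept(Path path) {
        Matcher m = DELTA_DIR.matcher(path.getName());
        if (!m.matches()) {
            return true; // not a delta dir; leave it to other filters
        }
        long writeId = Long.parseLong(m.group(1));
        return writeId <= highWatermark && !openOrAborted.contains(writeId);
    }
}
{code}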

Cheers,
Gopal




[jira] [Created] (HIVE-20972) Enable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-20972:
--

 Summary: Enable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
 Key: HIVE-20972
 URL: https://issues.apache.org/jira/browse/HIVE-20972
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar








Re: Hive ACID files compacted directory rename on cloud blob stores.

2018-11-26 Thread Abhishek Somani
Thanks for your reply.

To clarify, do you mean that the problem is solved already because Hive
ACID looks at the '_orc_acid_version' file in the directory before assuming a
directory is ready to read? Or do you mean that it *could have* looked at
the file before deciding to read it, and that would have been one way to
solve it? Asking because I couldn't find such a check, but I might very
well have missed it.

However, the "open a txn for compact & commit it" is definitely neater.

I agree, and it seems to have been done this way in HIVE-20823.

Thanks,
Somani

On Mon, Nov 26, 2018 at 1:57 PM Gopal Vijayaraghavan 
wrote:

>
> >Oh but s3Guard will not solve the atomicity problem, right?
>
> S3Guard does solve the atomicity problem, because compactors don't just
> rename directories.
>
> The basic consistency needed for ACID is - list after delete and list
> after create (which S3 does not have).
>
> They also place a file named '_orc_acid_version' in the directory.
>
> This happens after rename() returns.
>
> fs.rename(fileStatus.getPath(), newPath);
> AcidUtils.OrcAcidVersion.writeVersionFile(newPath, fs);
>
> With S3Guard, all that is needed is to check for that file (& if it is
> missing it is not a complete compacted dir yet).
>
> However, the "open a txn for compact & commit it" is definitely neater.
>
> > So that means that the directory will be "visible while in progress", and
> >  the reader might pick up the compacted directory even when all files
> > haven't been copied.
>
> In another thread today, I mentioned how ACID is built on top of ignoring
> directories, it can do that easily.
>
> The Parquet or Avro transactional system in Hive boils down to a
> PathFilter with some numbers in the path.
>
> Cheers,
> Gopal
>
>
>


Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/
---

(Updated Nov. 27, 2018, 7:18 a.m.)


Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.


Changes
---

Rebased to the latest code on master.


Bugs: HIVE-20740
https://issues.apache.org/jira/browse/HIVE-20740


Repository: hive-git


Description
---

HIVE-20740 : Remove global lock in ObjectStore.setConf method


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
 5a88550f0625a7ec1890df7f54e7fa579f58fff4 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
5cb0a887e672f49739f5b648e608fba66de06326 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 
455ffc3887e62fa503cc3fa28255702ea9da3cc0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 570281b54fa236d5bb568b4ded9b166ef367f613 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java
 PRE-CREATION 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 af9efd98ea210335c6ac1d3da8624e02aadc2706 


Diff: https://reviews.apache.org/r/69054/diff/6/

Changes: https://reviews.apache.org/r/69054/diff/5-6/


Testing
---


Thanks,

Vihang Karajgaonkar



[jira] [Created] (HIVE-20973) Optimizer: Reduce de-dup changes the hash-function of a reducer edge

2018-11-26 Thread Gopal V (JIRA)
Gopal V created HIVE-20973:
--

 Summary: Optimizer: Reduce de-dup changes the hash-function of a 
reducer edge
 Key: HIVE-20973
 URL: https://issues.apache.org/jira/browse/HIVE-20973
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V


{code}
private static void propagateMaxNumReducers(ReduceSinkJoinDeDuplicateProcCtx dedupCtx,
    ReduceSinkOperator rsOp, int maxNumReducers) throws SemanticException {
  if (rsOp == null) {
    // Bail out
    return;
  }
  if (rsOp.getChildOperators().get(0) instanceof MapJoinOperator ||
      rsOp.getChildOperators().get(0) instanceof CommonMergeJoinOperator) {
    for (Operator p : rsOp.getChildOperators().get(0).getParentOperators()) {
      ReduceSinkOperator pRSOp = (ReduceSinkOperator) p;
      pRSOp.getConf().setReducerTraits(EnumSet.of(ReducerTraits.FIXED));
      pRSOp.getConf().setNumReducers(maxNumReducers);
      LOG.debug("Set {} to FIXED parallelism: {}", pRSOp, maxNumReducers);
      if (pRSOp.getConf().isForwarding()) {
        ReduceSinkOperator newRSOp =
            CorrelationUtilities.findFirstPossibleParent(
                pRSOp, ReduceSinkOperator.class, dedupCtx.trustScript());
        propagateMaxNumReducers(dedupCtx, newRSOp, maxNumReducers);
      }
    }
{code}

FIXED used to mean AUTOPARALLEL=false, but now FIXED means a different hash 
function from UNIFORM.





[jira] [Created] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Peter Vary (JIRA)
Peter Vary created HIVE-20971:
-

 Summary: TestJdbcWithDBTokenStore[*] should both use 
MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
 Key: HIVE-20971
 URL: https://issues.apache.org/jira/browse/HIVE-20971
 Project: Hive
  Issue Type: Bug
  Components: Test
Reporter: Peter Vary
Assignee: Peter Vary


The original intent was to use 
MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases


