[jira] [Commented] (ASTERIXDB-2235) Normalization exception during a sort

2019-05-17 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842587#comment-16842587
 ] 

Taewoo Kim commented on ASTERIXDB-2235:
---

[~tillw]: Yes. It happened a year ago and is not reproducible on my side.
Unless we see this error again, we can close this.

> Normalization exception during a sort
> -
>
> Key: ASTERIXDB-2235
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2235
> Project: Apache AsterixDB
>  Issue Type: Bug
>Affects Versions: 0.9.4
>Reporter: Taewoo Kim
>Assignee: Ali Alsuliman
>Priority: Major
>  Labels: triaged
> Fix For: 0.9.4.2
>
>
> A Twittermap query generates this exception during execution, and the node
> becomes unavailable.
> {code}
> Jan 09, 2018 5:09:28 PM org.apache.hyracks.control.nc.Task run
> WARNING: Task TAID:TID:ANID:ODID:2:0:7:0 failed with exception
> org.apache.hyracks.api.exceptions.HyracksDataException: 
> java.lang.IllegalStateException: Corrupted string bytes: trying to access 
> entry 318767187 in a byte array of length 32768
>   at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:48)
>   at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:418)
>   at org.apache.hyracks.control.nc.Task.run(Task.java:323)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IllegalStateException: Corrupted string bytes: trying to 
> access entry 318767187 in a byte array of length 32768
>   at 
> org.apache.hyracks.util.encoding.VarLenIntEncoderDecoder.decode(VarLenIntEncoderDecoder.java:82)
>   at 
> org.apache.hyracks.data.std.primitive.ByteArrayPointable.getContentLength(ByteArrayPointable.java:154)
>   at 
> org.apache.hyracks.data.std.primitive.ByteArrayPointable.normalize(ByteArrayPointable.java:174)
>   at 
> org.apache.hyracks.dataflow.common.data.normalizers.ByteArrayNormalizedKeyComputerFactory$1.normalize(ByteArrayNormalizedKeyComputerFactory.java:34)
>   at 
> org.apache.asterix.dataflow.data.nontagged.keynormalizers.AWrappedAscNormalizedKeyComputerFactory$1.normalize(AWrappedAscNormalizedKeyComputerFactory.java:46)
>   at 
> org.apache.hyracks.api.dataflow.value.INormalizedKeyComputer.normalize(INormalizedKeyComputer.java:25)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.sort(AbstractFrameSorter.java:193)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractSortRunGenerator.close(AbstractSortRunGenerator.java:48)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1.close(AbstractSorterOperatorDescriptor.java:132)
>   at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:409)
>   ... 4 more
> {code}
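The frames above point at the length prefix: string lengths are stored as variable-length integers, and if those bytes are corrupted the decoded length can dwarf the backing frame (here 318767187 vs. 32768). A minimal sketch of that failure mode, with illustrative names rather than the actual VarLenIntEncoderDecoder code:

```java
// Hypothetical sketch (class and method names are illustrative, not the real
// Hyracks API): a varint decoder where each byte carries 7 payload bits and
// the high bit marks continuation. A single garbled byte can make the decoded
// length wildly exceed the backing array, which the bounds check turns into
// an IllegalStateException instead of a wild array access.
public class VarLenDemo {
    static int decode(byte[] bytes, int start) {
        int value = 0;
        int pos = start;
        while ((bytes[pos] & 0x80) != 0) {      // continuation bit set
            value = (value << 7) | (bytes[pos] & 0x7f);
            pos++;
        }
        return (value << 7) | (bytes[pos] & 0x7f);
    }

    // Mimics the failing check: the decoded length must fit in the array.
    static int contentLength(byte[] bytes, int start) {
        int len = decode(bytes, start);
        if (start + len > bytes.length) {
            throw new IllegalStateException("Corrupted string bytes: trying to access entry "
                    + len + " in a byte array of length " + bytes.length);
        }
        return len;
    }

    public static void main(String[] args) {
        byte[] ok = new byte[] {0x05, 'h', 'e', 'l', 'l', 'o'};
        System.out.println(contentLength(ok, 0));   // 5

        // High bits set on every prefix byte: decodes to a huge length.
        byte[] corrupted = new byte[] {(byte) 0xff, (byte) 0xff, 0x7f};
        try {
            contentLength(corrupted, 0);
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```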



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (ASTERIXDB-2338) IllegalArgumentException happens when a page of an inverted list is read concurrently

2019-04-10 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2338.
-
Resolution: Fixed

> IllegalArgumentException happens when a page of an inverted list is read
> concurrently
> 
>
> Key: ASTERIXDB-2338
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2338
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> If a page of an inverted list is read concurrently by multiple threads, the
> following exception happens. This is because concurrency control when reading
> a buffer in the buffer cache is not implemented.
> {code:java}
> org.apache.hyracks.api.exceptions.HyracksDataException: 
> java.lang.IllegalArgumentException
> at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51)
>  ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:247)
>  ~[hyracks-storage-am-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.write(AbstractFrameAppender.java:93)
>  ~[hyracks-dataflow-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushAndReset(AbstractOneInputOneOutputOneFramePushRuntime.java:78)
>  ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushIfNotFailed(AbstractOneInputOneOutputOneFramePushRuntime.java:84)
>  ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:56)
>  ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.close(AssignRuntimeFactory.java:119)
>  ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.close(EmptyTupleSourceRuntimeFactory.java:65)
>  ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$SourcePushRuntime.initialize(AlgebricksMetaOperatorDescriptor.java:111)
>  ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$$Lambda$74/144499656.run(Unknown
>  Source) ~[?:?]
> at 
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$9(SuperActivityOperatorNodePushable.java:204)
>  ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$$Lambda$76/2082033757.call(Unknown
>  Source) ~[?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0]
> at java.lang.Thread.run(Thread.java:744) ~[?:1.8.0]
> Caused by: java.lang.IllegalArgumentException
> at java.nio.Buffer.position(Buffer.java:244) ~[?:1.8.0]
> at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:209) ~[?:1.8.0]
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.FixedSizeElementInvertedListCursor.loadPages(FixedSizeElementInvertedListCursor.java:225)
>  ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.search.TOccurrenceSearcher.search(TOccurrenceSearcher.java:72)
>  ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.search(OnDiskInvertedIndex.java:453)
>  ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexSearchCursor.doHasNext(LSMInvertedIndexSearchCursor.java:159)
>  ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.common.EnforcedIndexCursor.hasNext(EnforcedIndexCursor.java:69)
>  ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> 
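The root cause described above can be illustrated with plain java.nio: a ByteBuffer carries mutable position state, so unsynchronized concurrent readers can hand Buffer.position() an out-of-range value. A sketch (illustrative, not the actual fix) showing both the failure mode and the duplicate()-per-reader pattern that avoids shared position state:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not the actual AsterixDB code: two threads reading the
// same cached page buffer can interleave position updates; duplicate() gives
// each reader its own position/limit while sharing the underlying bytes.
public class SharedPageDemo {
    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(16);
        page.putInt(42).flip();                  // pretend this is a cached page

        // Each reader works on a duplicate: independent position, shared content.
        ByteBuffer readerA = page.duplicate();
        ByteBuffer readerB = page.duplicate();

        int a = readerA.getInt();                // advances only readerA's position
        int b = readerB.getInt();                // readerB still starts at 0
        System.out.println(a + " " + b);         // 42 42

        // The failure mode directly: an out-of-range position (what an
        // interleaved update can produce) is rejected by Buffer.position.
        try {
            page.position(page.capacity() + 1);
        } catch (IllegalArgumentException e) {
            System.out.println("invalid position rejected");
        }
    }
}
```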

[jira] [Created] (ASTERIXDB-2517) Ingestion process failed on a cluster with two machines.

2019-02-07 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2517:
-

 Summary: Ingestion process failed on a cluster with two machines.
 Key: ASTERIXDB-2517
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2517
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim
 Attachments: cc.log, nc-1.log

We have a cluster with two machines. Out of 1.5 billion records, about 1.2
billion were ingested using a socket adapter. However, NC-1, which is located
on the same machine, was shut down. This happened around 19:21 (please see the
attached log records around that time).

 





[jira] [Closed] (ASTERIXDB-2487) Cluster becomes UNUSABLE with "java.lang.IllegalStateException: Couldn't find any checkpoints for resource"

2018-11-27 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2487.
-
Resolution: Invalid

> Cluster becomes UNUSABLE with "java.lang.IllegalStateException: Couldn't 
> find any checkpoints for resource"
> 
>
> Key: ASTERIXDB-2487
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2487
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
> Attachments: nc-1.log
>
>
> The Clouberry cluster became UNUSABLE after nc-1 (one of five NCs) 
> generated the following exception.
>  
> {code:java}
> 21:32:10.659 [Executor-10173:1] WARN 
> org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any 
> checkpoint file for index 
> io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382.
>  Content of dir are null.
> 21:32:10.659 [Executor-10172:1] WARN 
> org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any 
> checkpoint file for index 
> io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c.
>  Content of dir are null.
> 21:32:10.660 [Executor-10173:1] ERROR 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
> operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
> "/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
>  "memory" : [{"class":"LSMBTreeMemoryComponent", 
> "state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
> "pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
> "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
> "id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
> "current-memory-component":1}
> 21:32:10.660 [Executor-10173:1] ERROR 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
> operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
> "/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
>  "memory" : [{"class":"LSMBTreeMemoryComponent", 
> "state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
> "pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
> "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
> "id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
> "current-memory-component":1}
> java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 
> io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382
> at 
> org.apache.asterix.app.nc.IndexCheckpointManager.getLatest(IndexCheckpointManager.java:145)
>  ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.IndexCheckpointManager.flushed(IndexCheckpointManager.java:86)
>  ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.addComponentToCheckpoint(LSMIOOperationCallback.java:136)
>  ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.afterFinalize(LSMIOOperationCallback.java:123)
>  ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doIo(LSMHarness.java:544)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.flush(LSMHarness.java:513)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.flush(LSMTreeIndexAccessor.java:122)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:38)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:29)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_161]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_161]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_161]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
> 21:32:10.663 [Executor-10172:1] ERROR 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
> operation.afterFinalize failed on {"class" : 
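The lookup that fails in the trace above can be sketched in a few lines: File.listFiles() returns null on an I/O error (for example when the process has exhausted its open-file limit), which is indistinguishable from "no checkpoint files exist" and surfaces as the IllegalStateException. Illustrative names, not the actual IndexCheckpointManager code:

```java
import java.io.File;

// Hypothetical sketch: picking the latest checkpoint file for an index.
// listFiles() returning null (I/O error, e.g. EMFILE from a low ulimit)
// looks exactly like an empty directory and triggers the exception above.
public class CheckpointLookupDemo {
    static File latestCheckpoint(File indexDir) {
        File[] files = indexDir.listFiles();     // null on I/O error
        if (files == null || files.length == 0) {
            throw new IllegalStateException(
                    "Couldn't find any checkpoints for resource: " + indexDir.getPath());
        }
        File latest = files[0];
        for (File f : files) {                   // pick the newest checkpoint file
            if (f.lastModified() > latest.lastModified()) {
                latest = f;
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        try {
            latestCheckpoint(new File("/nonexistent/index/dir"));
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```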

[jira] [Commented] (ASTERIXDB-2487) Cluster becomes UNUSABLE with "java.lang.IllegalStateException: Couldn't find any checkpoints for resource"

2018-11-27 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700860#comment-16700860
 ] 

Taewoo Kim commented on ASTERIXDB-2487:
---

[~luochen01] suggested that the ulimit setting is too small. I verified that 
the "open files" limit is 1024 and increased it to 100,000.
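As a reference, checking and raising the open-files limit might look like the following (the limits.conf entries are an illustrative sketch; persistence steps vary by OS):

```shell
# Check the current soft and hard limits on open file descriptors.
ulimit -Sn
ulimit -Hn

# Raise the soft limit for this shell session (cannot exceed the hard limit);
# ignore the failure if the hard limit is lower.
ulimit -n 100000 2>/dev/null || true

# To make the change persistent for a user on Linux, add entries (illustrative)
# to /etc/security/limits.conf:
#   <user>  soft  nofile  100000
#   <user>  hard  nofile  100000
```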

> Cluster becomes UNUSABLE with "java.lang.IllegalStateException: Couldn't 
> find any checkpoints for resource"
> 
>
> Key: ASTERIXDB-2487
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2487
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
> Attachments: nc-1.log
>
>
> The Clouberry cluster became UNUSABLE after nc-1 (one of five NCs) 
> generated the following exception.
>  
> {code:java}
> 21:32:10.659 [Executor-10173:1] WARN 
> org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any 
> checkpoint file for index 
> io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382.
>  Content of dir are null.
> 21:32:10.659 [Executor-10172:1] WARN 
> org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any 
> checkpoint file for index 
> io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c.
>  Content of dir are null.
> 21:32:10.660 [Executor-10173:1] ERROR 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
> operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
> "/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
>  "memory" : [{"class":"LSMBTreeMemoryComponent", 
> "state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
> "pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
> "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
> "id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
> "current-memory-component":1}
> 21:32:10.660 [Executor-10173:1] ERROR 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
> operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
> "/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
>  "memory" : [{"class":"LSMBTreeMemoryComponent", 
> "state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
> "pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
> "state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
> "id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
> "current-memory-component":1}
> java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 
> io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382
> at 
> org.apache.asterix.app.nc.IndexCheckpointManager.getLatest(IndexCheckpointManager.java:145)
>  ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.IndexCheckpointManager.flushed(IndexCheckpointManager.java:86)
>  ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.addComponentToCheckpoint(LSMIOOperationCallback.java:136)
>  ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.afterFinalize(LSMIOOperationCallback.java:123)
>  ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doIo(LSMHarness.java:544)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.flush(LSMHarness.java:513)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.flush(LSMTreeIndexAccessor.java:122)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:38)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:29)
>  [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_161]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_161]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_161]
> at java.lang.Thread.run(Thread.java:748) 

[jira] [Updated] (ASTERIXDB-2487) Cluster becomes UNUSABLE with "java.lang.IllegalStateException: Couldn't find any checkpoints for resource"

2018-11-27 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2487:
--
Description: 
The Clouberry cluster became UNUSABLE after nc-1 (one of five NCs) 
generated the following exception.

 
{code:java}
21:32:10.659 [Executor-10173:1] WARN 
org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any checkpoint 
file for index 
io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382.
 Content of dir are null.
21:32:10.659 [Executor-10172:1] WARN 
org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any checkpoint 
file for index 
io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c.
 Content of dir are null.
21:32:10.660 [Executor-10173:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
"current-memory-component":1}
21:32:10.660 [Executor-10173:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
"current-memory-component":1}
java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 
io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382
at 
org.apache.asterix.app.nc.IndexCheckpointManager.getLatest(IndexCheckpointManager.java:145)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.app.nc.IndexCheckpointManager.flushed(IndexCheckpointManager.java:86)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.addComponentToCheckpoint(LSMIOOperationCallback.java:136)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.afterFinalize(LSMIOOperationCallback.java:123)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doIo(LSMHarness.java:544)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.flush(LSMHarness.java:513)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.flush(LSMTreeIndexAccessor.java:122)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:38)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:29)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
21:32:10.663 [Executor-10172:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[24,24]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[23,23]"}], "disk" : 4, "num-scheduled-flushes":1, 
"current-memory-component":1}
java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 

[jira] [Updated] (ASTERIXDB-2487) Cluster becomes UNUSABLE with "java.lang.IllegalStateException: Couldn't find any checkpoints for resource"

2018-11-26 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2487:
--
Description: 
The Clouberry cluster became UNUSABLE after nc-1 (one of five NCs) 
generated the following exception.

 
{code:java}
21:32:10.659 [Executor-10173:1] WARN 
org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any checkpoint 
file for index 
io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382.
 Content of dir are null.
21:32:10.659 [Executor-10172:1] WARN 
org.apache.asterix.app.nc.IndexCheckpointManager - Couldn't find any checkpoint 
file for index 
io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c.
 Content of dir are null.
21:32:10.660 [Executor-10173:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
"current-memory-component":1}
21:32:10.660 [Executor-10173:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
"current-memory-component":1}
java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 
io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382
at 
org.apache.asterix.app.nc.IndexCheckpointManager.getLatest(IndexCheckpointManager.java:145)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.app.nc.IndexCheckpointManager.flushed(IndexCheckpointManager.java:86)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.addComponentToCheckpoint(LSMIOOperationCallback.java:136)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.afterFinalize(LSMIOOperationCallback.java:123)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doIo(LSMHarness.java:544)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.flush(LSMHarness.java:513)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.flush(LSMTreeIndexAccessor.java:122)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:38)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:29)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
21:32:10.663 [Executor-10172:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[24,24]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[23,23]"}], "disk" : 4, "num-scheduled-flushes":1, 
"current-memory-component":1}
java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 

[jira] [Created] (ASTERIXDB-2487) Cluster becomes UNUSABLE with "java.lang.IllegalStateException: Couldn't find any checkpoints for resource"

2018-11-26 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2487:
-

 Summary: Cluster becomes UNUSABLE with 
"java.lang.IllegalStateException: Couldn't find any checkpoints for resource"
 Key: ASTERIXDB-2487
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2487
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim
 Attachments: nc-1.log

The Clouberry cluster became UNUSABLE after nc-1 (one of five NCs) 
generated the following exception.

 
{code:java}
21:32:10.660 [Executor-10173:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[9,9]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[8,8]"}], "disk" : 3, "num-scheduled-flushes":1, 
"current-memory-component":1}
java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 
io1/storage/partition_0/twitter/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382/0/ds_tweet_e9ad9c2394f7dc7b6a69fb43e52a7382
at 
org.apache.asterix.app.nc.IndexCheckpointManager.getLatest(IndexCheckpointManager.java:145)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.app.nc.IndexCheckpointManager.flushed(IndexCheckpointManager.java:86)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.addComponentToCheckpoint(LSMIOOperationCallback.java:136)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.afterFinalize(LSMIOOperationCallback.java:123)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doIo(LSMHarness.java:544)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.flush(LSMHarness.java:513)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.flush(LSMTreeIndexAccessor.java:122)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:38)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.FlushOperation.call(FlushOperation.java:29)
 [hyracks-storage-am-lsm-common-0.3.5-SNAPSHOT.jar:0.3.5-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
21:32:10.663 [Executor-10172:1] ERROR 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - FLUSH 
operation.afterFinalize failed on {"class" : "LSMBTree", "dir" : 
"/home/waans11/asterixdb/io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c",
 "memory" : [{"class":"LSMBTreeMemoryComponent", 
"state":"READABLE_UNWRITABLE_FLUSHING", "writers":0, "readers":1, 
"pendingFlushes":0, "id":"[24,24]"}, {"class":"LSMBTreeMemoryComponent", 
"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, 
"id":"[23,23]"}], "disk" : 4, "num-scheduled-flushes":1, 
"current-memory-component":1}
java.lang.IllegalStateException: Couldn't find any checkpoints for resource: 
io2/storage/partition_1/twitter/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c/0/ds_tweet_9460370bb0ca1c98a779b1bcc6861c2c
at 
org.apache.asterix.app.nc.IndexCheckpointManager.getLatest(IndexCheckpointManager.java:145)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.app.nc.IndexCheckpointManager.flushed(IndexCheckpointManager.java:86)
 ~[asterix-app-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.addComponentToCheckpoint(LSMIOOperationCallback.java:136)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]
at 
org.apache.asterix.common.ioopcallbacks.LSMIOOperationCallback.afterFinalize(LSMIOOperationCallback.java:123)
 ~[asterix-common-0.9.5-SNAPSHOT.jar:0.9.5-SNAPSHOT]

[jira] [Commented] (ASTERIXDB-2481) Out of Memory error doing aggregation

2018-11-14 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687291#comment-16687291
 ] 

Taewoo Kim commented on ASTERIXDB-2481:
---

Can you also attach the log records?

> Out of Memory error doing aggregation
> -
>
> Key: ASTERIXDB-2481
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2481
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler, RT - Runtime, SQL - Translator SQL++
>Affects Versions: 0.9.5
> Environment: Linux
>Reporter: Gift Sinthong
>Priority: Critical
> Attachments: Screen Shot 2018-11-14 at 3.12.31 PM.png
>
>
> This is the schema for this query:
> CREATE TYPE Test AS open{
>  unique2: int64
> };
> CREATE DATASET wisconsin_1gb(Test)
>  PRIMARY KEY unique2;
> This is the query:
> SELECT min( t.oddOnePercent) as min, max(t.oddOnePercent) as max, 
> count(distinct t.oddOnePercent) as cnt
>  FROM wisconsin_5gb t ;
>  
> The plan for this query:
> distribute result [$$46]
> -- DISTRIBUTE_RESULT |UNPARTITIONED|
>  exchange
>  -- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
>  project ([$$46])
>  -- STREAM_PROJECT |UNPARTITIONED|
>  assign [$$46] <- [\{"min": $$48, "max": $$49, "cnt": $$50}]
>  -- ASSIGN |UNPARTITIONED|
>  project ([$$48, $$49, $$50])
>  -- STREAM_PROJECT |UNPARTITIONED|
>  subplan {
>  aggregate [$$50] <- [agg-sql-sum($$53)]
>  -- AGGREGATE |LOCAL|
>  aggregate [$$53] <- [agg-sql-count($$43)]
>  -- AGGREGATE |LOCAL|
>  distinct ([$$43])
>  -- MICRO_PRE_SORTED_DISTINCT_BY |LOCAL|
>  order (ASC, $$43) 
>  -- IN_MEMORY_STABLE_SORT [$$43(ASC)] |LOCAL|
>  assign [$$43] <- [$$52.getField("oddOnePercent")]
>  -- ASSIGN |UNPARTITIONED|
>  assign [$$52] <- [$#4.getField(0)]
>  -- ASSIGN |UNPARTITIONED|
>  unnest $#4 <- scan-collection($$28)
>  -- UNNEST |UNPARTITIONED|
>  nested tuple source
>  -- NESTED_TUPLE_SOURCE |UNPARTITIONED|
>  }
>  -- SUBPLAN |UNPARTITIONED|
>  aggregate [$$28, $$48, $$49] <- [listify($$27), agg-sql-min($$33), 
> agg-sql-max($$33)]
>  -- AGGREGATE |UNPARTITIONED|
>  exchange
>  -- RANDOM_MERGE_EXCHANGE |PARTITIONED|
>  project ([$$27, $$33])
>  -- STREAM_PROJECT |PARTITIONED|
>  assign [$$33, $$27] <- [$$t.getField("oddOnePercent"), \{"t": $$t}]
>  -- ASSIGN |PARTITIONED|
>  project ([$$t])
>  -- STREAM_PROJECT |PARTITIONED|
>  exchange
>  -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>  data-scan []<-[$$47, $$t] <- benchmark.wisconsin_5gb
>  -- DATASOURCE_SCAN |PARTITIONED|
>  exchange
>  -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>  empty-tuple-source
>  -- EMPTY_TUPLE_SOURCE |PARTITIONED|
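The plan above also hints at where the memory goes: the unpartitioned aggregate listify($$27) materializes every input record into a single list so the subplan can compute the distinct count with an in-memory sort. A rough Python sketch of the two strategies (illustrative only; this is not AsterixDB code, and the helper names are made up):

```python
# Hypothetical illustration (not AsterixDB code): the plan computes
# count(distinct ...) by first listify-ing every record into one in-memory
# list, then sorting it inside the subplan. Memory therefore grows with the
# dataset size, which can exhaust the heap on a multi-GB dataset.

def listify_then_distinct(values):
    """Mirrors the plan: materialize everything, sort in memory, de-dup."""
    materialized = list(values)          # listify($$27): O(n) memory
    materialized.sort()                  # IN_MEMORY_STABLE_SORT
    count = 0
    prev = object()                      # sentinel that equals nothing
    for v in materialized:               # MICRO_PRE_SORTED_DISTINCT_BY
        if v != prev:
            count += 1
            prev = v
    return count

def streaming_distinct(values):
    """An alternative: track only the distinct values seen so far,
    using O(d) memory where d is the number of distinct values."""
    return len(set(values))

data = [i % 100 for i in range(10_000)]  # many rows, few distinct values
assert listify_then_distinct(data) == streaming_distinct(data) == 100
```

Both return the same count; only the peak memory differs, which is why the materializing plan can fail on a 5 GB dataset while a streaming distinct count would not.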



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ASTERIXDB-2478) Feed Ingestion stucks after ingesting millions records.

2018-11-13 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2478:
--
Summary: Feed Ingestion stucks after ingesting millions records.  (was: 
Feed Ingestion stuck after ingesting millions records.)

> Feed Ingestion stucks after ingesting millions records.
> ---
>
> Key: ASTERIXDB-2478
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2478
> Project: Apache AsterixDB
>  Issue Type: Bug
>Affects Versions: 0.9.4
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
> Attachments: jstack-11132018.zip, logs-11132018.zip, logs2.tar.gz
>
>
> This is the new cloudberry cluster, which consists of five NUC machines. 
> About three months ago, we were able to ingest 1.2 billion records into this 
> cluster. The day before yesterday, we started ingesting our tweet records 
> using the current master branch. After ingesting 70 million records, the feed 
> process stopped progressing, and the logs at that moment did not show any 
> particular errors.
> [~idleft] and [~luochen01] investigated this issue and suggested that we try 
> another commit (b2eb44177e7ac6bd180c7bc325cf007fb925c9e2) to see whether it 
> would work. It did not; the symptom happened again. 
>  
> Here, (1) the log records for the current master and (2) the log records for 
> the commit b2eb44177e7ac6bd180c7bc325cf007fb925c9e2 are attached.
> (1) 
> [https://drive.google.com/file/d/1LZNPjcIvZXbYYk53wZ4Pc6emfjCwFrUi/view?usp=sharing]
> (2) logs2.tar.gz (see the attachment)
>  





[jira] [Updated] (ASTERIXDB-2478) Feed Ingestion stuck after ingesting millions records.

2018-11-13 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2478:
--
Attachment: jstack-11132018.zip

> Feed Ingestion stuck after ingesting millions records.
> --
>
> Key: ASTERIXDB-2478
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2478
> Project: Apache AsterixDB
>  Issue Type: Bug
>Affects Versions: 0.9.4
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
> Attachments: jstack-11132018.zip, logs-11132018.zip, logs2.tar.gz
>
>
> This is the new cloudberry cluster, which consists of five NUC machines. 
> About three months ago, we were able to ingest 1.2 billion records into this 
> cluster. The day before yesterday, we started ingesting our tweet records 
> using the current master branch. After ingesting 70 million records, the feed 
> process stopped progressing, and the logs at that moment did not show any 
> particular errors.
> [~idleft] and [~luochen01] investigated this issue and suggested that we try 
> another commit (b2eb44177e7ac6bd180c7bc325cf007fb925c9e2) to see whether it 
> would work. It did not; the symptom happened again. 
>  
> Here, (1) the log records for the current master and (2) the log records for 
> the commit b2eb44177e7ac6bd180c7bc325cf007fb925c9e2 are attached.
> (1) 
> [https://drive.google.com/file/d/1LZNPjcIvZXbYYk53wZ4Pc6emfjCwFrUi/view?usp=sharing]
> (2) logs2.tar.gz (see the attachment)
>  





[jira] [Updated] (ASTERIXDB-2478) Feed Ingestion stuck after ingesting millions records.

2018-11-13 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2478:
--
Attachment: logs.tar.gz

> Feed Ingestion stuck after ingesting millions records.
> --
>
> Key: ASTERIXDB-2478
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2478
> Project: Apache AsterixDB
>  Issue Type: Bug
>Affects Versions: 0.9.4
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
> Attachments: logs2.tar.gz
>
>
> This is the new cloudberry cluster, which consists of five NUC machines. 
> About three months ago, we were able to ingest 1.2 billion records into this 
> cluster. The day before yesterday, we started ingesting our tweet records 
> using the current master branch. After ingesting 70 million records, the feed 
> process stopped progressing, and the logs at that moment did not show any 
> particular errors.
> [~idleft] and [~luochen01] investigated this issue and suggested that we try 
> another commit (b2eb44177e7ac6bd180c7bc325cf007fb925c9e2) to see whether it 
> would work. It did not; the symptom happened again. 
>  
> Here, (1) the log records for the current master and (2) the log records for 
> the commit b2eb44177e7ac6bd180c7bc325cf007fb925c9e2 are attached.
> (1) 
> [https://drive.google.com/file/d/1LZNPjcIvZXbYYk53wZ4Pc6emfjCwFrUi/view?usp=sharing]
> (2) logs2.tar.gz (see the attachment)
>  





[jira] [Commented] (ASTERIXDB-2478) Feed Ingestion stuck after ingesting millions records.

2018-11-13 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686114#comment-16686114
 ] 

Taewoo Kim commented on ASTERIXDB-2478:
---

I tried the current master (as of 11/13/2018), and the ingestion process got 
stuck again. I have attached the log files and the network API messages from 
the NCs. [^logs-11132018.zip]

> Feed Ingestion stuck after ingesting millions records.
> --
>
> Key: ASTERIXDB-2478
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2478
> Project: Apache AsterixDB
>  Issue Type: Bug
>Affects Versions: 0.9.4
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
> Attachments: logs-11132018.zip, logs2.tar.gz
>
>
> This is the new cloudberry cluster, which consists of five NUC machines. 
> About three months ago, we were able to ingest 1.2 billion records into this 
> cluster. The day before yesterday, we started ingesting our tweet records 
> using the current master branch. After ingesting 70 million records, the feed 
> process stopped progressing, and the logs at that moment did not show any 
> particular errors.
> [~idleft] and [~luochen01] investigated this issue and suggested that we try 
> another commit (b2eb44177e7ac6bd180c7bc325cf007fb925c9e2) to see whether it 
> would work. It did not; the symptom happened again. 
>  
> Here, (1) the log records for the current master and (2) the log records for 
> the commit b2eb44177e7ac6bd180c7bc325cf007fb925c9e2 are attached.
> (1) 
> [https://drive.google.com/file/d/1LZNPjcIvZXbYYk53wZ4Pc6emfjCwFrUi/view?usp=sharing]
> (2) logs2.tar.gz (see the attachment)
>  





[jira] [Updated] (ASTERIXDB-2478) Feed Ingestion stuck after ingesting millions records.

2018-11-13 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2478:
--
Attachment: logs-11132018.zip

> Feed Ingestion stuck after ingesting millions records.
> --
>
> Key: ASTERIXDB-2478
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2478
> Project: Apache AsterixDB
>  Issue Type: Bug
>Affects Versions: 0.9.4
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
> Attachments: logs-11132018.zip, logs2.tar.gz
>
>
> This is the new cloudberry cluster, which consists of five NUC machines. 
> About three months ago, we were able to ingest 1.2 billion records into this 
> cluster. The day before yesterday, we started ingesting our tweet records 
> using the current master branch. After ingesting 70 million records, the feed 
> process stopped progressing, and the logs at that moment did not show any 
> particular errors.
> [~idleft] and [~luochen01] investigated this issue and suggested that we try 
> another commit (b2eb44177e7ac6bd180c7bc325cf007fb925c9e2) to see whether it 
> would work. It did not; the symptom happened again. 
>  
> Here, (1) the log records for the current master and (2) the log records for 
> the commit b2eb44177e7ac6bd180c7bc325cf007fb925c9e2 are attached.
> (1) 
> [https://drive.google.com/file/d/1LZNPjcIvZXbYYk53wZ4Pc6emfjCwFrUi/view?usp=sharing]
> (2) logs2.tar.gz (see the attachment)
>  





[jira] [Updated] (ASTERIXDB-2478) Feed Ingestion stuck after ingesting millions records.

2018-11-13 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2478:
--
Attachment: (was: logs.tar.gz)

> Feed Ingestion stuck after ingesting millions records.
> --
>
> Key: ASTERIXDB-2478
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2478
> Project: Apache AsterixDB
>  Issue Type: Bug
>Affects Versions: 0.9.4
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
> Attachments: logs2.tar.gz
>
>
> This is the new cloudberry cluster, which consists of five NUC machines. 
> About three months ago, we were able to ingest 1.2 billion records into this 
> cluster. The day before yesterday, we started ingesting our tweet records 
> using the current master branch. After ingesting 70 million records, the feed 
> process stopped progressing, and the logs at that moment did not show any 
> particular errors.
> [~idleft] and [~luochen01] investigated this issue and suggested that we try 
> another commit (b2eb44177e7ac6bd180c7bc325cf007fb925c9e2) to see whether it 
> would work. It did not; the symptom happened again. 
>  
> Here, (1) the log records for the current master and (2) the log records for 
> the commit b2eb44177e7ac6bd180c7bc325cf007fb925c9e2 are attached.
> (1) 
> [https://drive.google.com/file/d/1LZNPjcIvZXbYYk53wZ4Pc6emfjCwFrUi/view?usp=sharing]
> (2) logs2.tar.gz (see the attachment)
>  





[jira] [Created] (ASTERIXDB-2478) Feed Ingestion stucks after ingesting millions records.

2018-11-10 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2478:
-

 Summary: Feed Ingestion stucks after ingesting millions records.
 Key: ASTERIXDB-2478
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2478
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim
Assignee: Murtadha Hubail
 Attachments: logs2.tar.gz

This is the new cloudberry cluster, which consists of five NUC machines. 
About three months ago, we were able to ingest 1.2 billion records into this 
cluster. The day before yesterday, we started ingesting our tweet records 
using the current master branch. After ingesting 70 million records, the feed 
process stopped progressing, and the logs at that moment did not show any 
particular errors.

[~idleft] and [~luochen01] investigated this issue and suggested that we try 
another commit (b2eb44177e7ac6bd180c7bc325cf007fb925c9e2) to see whether it 
would work. It did not; the symptom happened again. 

 

Here, (1) the log records for the current master and (2) the log records for 
the commit b2eb44177e7ac6bd180c7bc325cf007fb925c9e2 are attached.

(1) 
[https://drive.google.com/file/d/1LZNPjcIvZXbYYk53wZ4Pc6emfjCwFrUi/view?usp=sharing]

(2) logs2.tar.gz (see the attachment)

 





[jira] [Closed] (ASTERIXDB-2454) Remove most AQL test cases

2018-10-17 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2454.
-
Resolution: Fixed

https://asterix-gerrit.ics.uci.edu/#/c/2979/

> Remove most AQL test cases
> --
>
> Key: ASTERIXDB-2454
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2454
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Ian Maxon
>Priority: Major
>
> The AQL language is deprecated now. It is only used for similarity join via 
> the AQL+ framework. Most of the test cases that are not directly related to 
> the similarity join can be removed. 





[jira] [Commented] (ASTERIXDB-2454) Remove most AQL test cases

2018-10-17 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654134#comment-16654134
 ] 

Taewoo Kim commented on ASTERIXDB-2454:
---

We will remove all AQL runtime test cases that satisfy the following conditions.

 

(1) There are corresponding SQL++ test cases.

(2) Not in the "fuzzyjoin" test group (directory).
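The two criteria above could be sketched roughly as follows (illustrative Python only; the actual cleanup happened in the AsterixDB repository, and the directory layout assumed here is hypothetical):

```python
# Hypothetical sketch of the removal criteria: an AQL test is removable if
# (1) a SQL++ counterpart exists and (2) it is not in the fuzzyjoin group.
# The queries/ vs queries_sqlpp/ layout is an assumption for illustration.
from pathlib import Path

def removable_aql_tests(aql_root, sqlpp_root):
    """Return relative paths of AQL tests that satisfy both conditions."""
    aql, sqlpp = Path(aql_root), Path(sqlpp_root)
    removable = []
    for f in sorted(aql.rglob("*.aql")):
        rel = f.relative_to(aql)
        if "fuzzyjoin" in rel.parts:          # condition (2): keep fuzzyjoin
            continue
        counterpart = (sqlpp / rel).with_suffix(".sqlpp")
        if counterpart.exists():              # condition (1): SQL++ twin exists
            removable.append(rel.as_posix())
    return removable
```

A test that has no SQL++ counterpart, or that lives under fuzzyjoin, is left in place.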

> Remove most AQL test cases
> --
>
> Key: ASTERIXDB-2454
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2454
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Ian Maxon
>Priority: Major
>
> The AQL language is deprecated now. It is only used for similarity join via 
> the AQL+ framework. Most of the test cases that are not directly related to 
> the similarity join can be removed. 





[jira] [Assigned] (ASTERIXDB-2454) Remove most AQL test cases

2018-10-17 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2454:
-

Assignee: Ian Maxon  (was: Taewoo Kim)

> Remove most AQL test cases
> --
>
> Key: ASTERIXDB-2454
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2454
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Ian Maxon
>Priority: Major
>
> The AQL language is deprecated now. It is only used for similarity join via 
> the AQL+ framework. Most of the test cases that are not directly related to 
> the similarity join can be removed. 





[jira] [Created] (ASTERIXDB-2465) CSV Output documentation needs to be updated.

2018-10-14 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2465:
-

 Summary: CSV Output documentation needs to be updated.
 Key: ASTERIXDB-2465
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2465
 Project: Apache AsterixDB
  Issue Type: Improvement
Reporter: Taewoo Kim


CSV Output documentation still uses AQL.

[https://ci.apache.org/projects/asterixdb/csv.html] 





[jira] [Closed] (ASTERIXDB-2455) Revise the documentation to deprecate the AQL section

2018-09-24 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2455.
-
Resolution: Fixed

> Revise the documentation to deprecate the AQL section
> -
>
> Key: ASTERIXDB-2455
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2455
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> The document section needs to be updated to deprecate AQL. 





[jira] [Assigned] (ASTERIXDB-2455) Revise the documentation to deprecate the AQL section

2018-09-24 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2455:
-

Assignee: Taewoo Kim

> Revise the documentation to deprecate the AQL section
> -
>
> Key: ASTERIXDB-2455
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2455
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> The document section needs to be updated to deprecate AQL. 





[jira] [Created] (ASTERIXDB-2455) Revise the documentation to deprecate the AQL section

2018-09-24 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2455:
-

 Summary: Revise the documentation to deprecate the AQL section
 Key: ASTERIXDB-2455
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2455
 Project: Apache AsterixDB
  Issue Type: Improvement
Reporter: Taewoo Kim


The document section needs to be updated to deprecate AQL. 





[jira] [Closed] (ASTERIXDB-2437) Index-only plan cannot be generated for a composite index if both fields are used and only one field is returned.

2018-09-24 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2437.
-
Resolution: Fixed

> Index-only plan cannot be generated for a composite index if both fields are 
> used and only one field is returned.
> -
>
> Key: ASTERIXDB-2437
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2437
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> When there is a composite index, if both of its fields are used in the select 
> condition but only one field is returned, the compiler generates an exception.
>  
> DDL and query:
> {code:java}
> drop dataverse twitter if exists;
> create dataverse twitter if not exists;
> use twitter;
> create type typeUser if not exists as open {
>     id: int64,
>     name: string,
>     screen_name : string,
>     profile_image_url : string,
>     lang : string,
>     location: string,
>     create_at: date,
>     description: string,
>     followers_count: int32,
>     friends_count: int32,
>     statues_count: int64
> };
> create type typePlace if not exists as open{
>     country : string,
>     country_code : string,
>     full_name : string,
>     id : string,
>     name : string,
>     place_type : string,
>     bounding_box : rectangle
> };
> create type typeGeoTag if not exists as open {
>     stateID: int32,
>     stateName: string,
>     countyID: int32,
>     countyName: string,
>     cityID: int32?,
>     cityName: string?
> };
> create type typeTweet if not exists as open {
>     create_at : datetime,
>     id: int64,
>     text: string,
>     in_reply_to_status : int64,
>     in_reply_to_user : int64,
>     favorite_count : int64,
>     coordinate: point?,
>     retweet_count : int64,
>     lang : string,
>     is_retweet: boolean,
>     hashtags : {{ string }} ?,
>     user_mentions : {{ int64 }} ? ,
>     user : typeUser,
>     place : typePlace?,
>     geo_tag: typeGeoTag
> };
> create dataset ds_tweet(typeTweet) if not exists primary key id;
> create index create_at_status_count_idx on ds_tweet(user.create_at, 
> user.statues_count);
> select value count(first.create_at) from (
> select t.user.create_at, t.user.statues_count, t.id from ds_tweet t
> where
>       t.user.create_at   >=
>       date_from_unix_time_in_days(1) and
>       t.user.create_at   <
>       date_from_unix_time_in_days(12000) and
>           t.user.statues_count  >= 0 and
>           t.user.statues_count  <  100
> ) first;
> {code}
>  
> Exception:
> {code:java}
> org.apache.hyracks.algebricks.common.exceptions.AlgebricksException: Could 
> not infer type for variable '$$57'.
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitVariableReferenceExpression(SetClosedRecordConstructorsRule.java:170)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitVariableReferenceExpression(SetClosedRecordConstructorsRule.java:77)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.core.algebra.expressions.VariableReferenceExpression.accept(VariableReferenceExpression.java:93)
>  ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:150)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:77)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.core.algebra.visitors.AbstractConstVarFunVisitor.visitScalarFunctionCallExpression(AbstractConstVarFunVisitor.java:39)
>  ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.core.algebra.expressions.ScalarFunctionCallExpression.accept(ScalarFunctionCallExpression.java:55)
>  ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:150)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:77)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> 

[jira] [Commented] (ASTERIXDB-2454) Remove most AQL test cases

2018-09-24 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626460#comment-16626460
 ] 

Taewoo Kim commented on ASTERIXDB-2454:
---

[~dlychagin-cb] : that's important to check. Thanks.

> Remove most AQL test cases
> --
>
> Key: ASTERIXDB-2454
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2454
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> The AQL language is deprecated now. It is only used for similarity join via 
> the AQL+ framework. Most of the test cases that are not directly related to 
> the similarity join can be removed. 





[jira] [Closed] (ASTERIXDB-2349) Similarity documentation needs to move from AQL to SQL++

2018-09-23 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2349.
-
Resolution: Fixed

> Similarity documentation needs to move from AQL to SQL++
> 
>
> Key: ASTERIXDB-2349
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2349
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: *DB - AsterixDB, DOC - Documentation
>Affects Versions: 0.9.3, 0.9.4
>Reporter: Michael J. Carey
>Assignee: Taewoo Kim
>Priority: Major
> Fix For: 0.9.4
>
>
> The similarity search documentation 
> ([https://asterixdb.apache.org/docs/0.9.3/aql/similarity.html)] has all of 
> its examples in AQL still.  (Unlike, e.g., the full-text stuff, which has 
> been nicely updated.)
> Someone in search-land needs to fix this...?





[jira] [Assigned] (ASTERIXDB-1880) Unequal number of valid ... exception during a similarity join query

2018-09-23 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-1880:
-

Assignee: (was: Taewoo Kim)

> Unequal number of valid ... exception during a similarity join query
> 
>
> Key: ASTERIXDB-1880
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1880
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
>
> On an 8-node cluster, the following similarity query generates the following 
> exception. There is a keyword index on the summary field.
> {code}
> Unequal number of valid Dictionary BTree, Inverted Lists, Deleted BTree, and 
> Bloom Filter files found. Aborting cleanup. [HyracksDataException]
> {code}
> {code}
> use dataverse exp;
> count(
> for $p in dataset
> "AmazonReviewProductID"
> for $o in dataset
> "AmazonReviewNoDup"
> for $i in dataset
> "AmazonReviewNoDup"
> where $p.asin /* +indexnl */ = $o.asin and $p.id >=
> int64("6450")
> and $p.id <=
> int64("7449")
> and /* +indexnl */ similarity-jaccard(word-tokens($o.summary), 
> word-tokens($i.summary)) >= 0.8 and $o.id < $i.id
> return {"oid":$o.id, "iid":$i.id}
> );
> {code}
> DDL
> {code}
> drop dataverse exp if exists;
> create dataverse exp;
> use dataverse exp;
> create type AmazonReviewType as open {
>   id: uuid
> }
> create dataset AmazonReviewNoDup(AmazonReviewType) primary key id 
> autogenerated;
> create index AmazonReviewNoDup_summary_kw_idx 
> on AmazonReviewNoDup(summary:string?) type keyword enforced;
> create type AmazonProductIDType as closed {
>   id: int64,
>   asin: string
> }
> create dataset AmazonReviewProductID(AmazonProductIDType) primary key id;
> {code}
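For context, the error message reflects an invariant of LSM inverted-index components: each disk component consists of a dictionary BTree, inverted lists, a deleted-keys BTree, and a Bloom filter, and cleanup aborts when the per-kind file counts disagree. A hypothetical sketch of that check (not the actual Hyracks code; the file-naming scheme here is invented):

```python
# Hypothetical sketch of the invariant behind "Unequal number of valid
# Dictionary BTree, Inverted Lists, Deleted BTree, and Bloom Filter files":
# every complete component contributes one file of each kind, so the counts
# must match. The ".kind" suffix convention is assumed for illustration.
from collections import Counter

KINDS = ("dict_btree", "inverted_lists", "deleted_btree", "bloom_filter")

def validate_component_files(filenames):
    """Return the number of complete components, or raise if counts differ."""
    counts = Counter(name.rsplit(".", 1)[-1] for name in filenames)
    per_kind = [counts.get(k, 0) for k in KINDS]
    if len(set(per_kind)) != 1:           # some kind has extra/missing files
        raise ValueError(
            "Unequal number of valid component files: "
            + ", ".join(f"{k}={c}" for k, c in zip(KINDS, per_kind)))
    return per_kind[0]

# One complete component validates; a stray extra file would raise.
files = ["c1.dict_btree", "c1.inverted_lists",
         "c1.deleted_btree", "c1.bloom_filter"]
assert validate_component_files(files) == 1
```

A partially written or partially deleted component (e.g. after a crash mid-flush) is one way such a mismatch can arise.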





[jira] [Created] (ASTERIXDB-2454) Remove most AQL test cases

2018-09-23 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2454:
-

 Summary: Remove most AQL test cases
 Key: ASTERIXDB-2454
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2454
 Project: Apache AsterixDB
  Issue Type: Improvement
Reporter: Taewoo Kim
Assignee: Taewoo Kim


The AQL language is deprecated now. It is only used for similarity join via the 
AQL+ framework. Most of the test cases that are not directly related to the 
similarity join can be removed. 





[jira] [Closed] (ASTERIXDB-2374) Index-only plan does not work as expected.

2018-09-06 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2374.
-
Resolution: Fixed

> Index-only plan does not work as expected.
> --
>
> Key: ASTERIXDB-2374
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2374
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> Currently, the index-only plan is generated, but none of the search results 
> go through the index-only-plan path. That is, every search result from a 
> secondary-index search is fed into the primary index.
>  
> (EDIT) The cause: when we fetch a tuple from disk, searchCallback.proceed() 
> must always return true, but currently it does not. 
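The routing decision described in the edit can be modeled roughly as follows (hypothetical Python; only the name proceed() comes from the report, everything else is invented for illustration):

```python
# Hypothetical model of index-only routing (not the actual Hyracks code).
# Each secondary-index hit calls proceed(): True means the tuple is verified
# and may bypass the primary index; False forces a primary-index lookup.

def route(tuples, proceed):
    index_only_path, primary_lookup_path = [], []
    for t in tuples:
        (index_only_path if proceed(t) else primary_lookup_path).append(t)
    return index_only_path, primary_lookup_path

# The bug: for tuples fetched from disk, proceed() effectively returned
# False even though it should always return True for them, so every result
# took the slow primary-index path.
buggy = lambda t: False
fixed = lambda t: True

fast, slow = route([1, 2, 3], buggy)
assert fast == [] and slow == [1, 2, 3]      # observed symptom
fast, slow = route([1, 2, 3], fixed)
assert fast == [1, 2, 3] and slow == []      # intended behavior
```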





[jira] [Commented] (ASTERIXDB-2443) The current word tokenizer is too restricted.

2018-08-15 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581756#comment-16581756
 ] 

Taewoo Kim commented on ASTERIXDB-2443:
---

[~che...@gmail.com] and I had a discussion, and we also talked briefly with 
[~dtabass]. An intermediate solution could be to switch from the whitelist 
approach (defining all non-delimiter characters) to a blacklist approach 
(defining all delimiters).
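The difference between the two policies can be sketched as follows (Python for illustration only, not AsterixDB's Java tokenizer; the delimiter set in the blacklist variant is an assumption):

```python
# Illustrative sketch of the two tokenizer policies. The whitelist version
# mirrors the current behavior (only A-Za-z0-9 are token characters); the
# blacklist version splits only on an explicit, assumed delimiter set, so
# international characters survive as token characters.
import re

def tokenize_whitelist(text):
    """Current behavior: every non-alphanumeric character is a delimiter."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]

DELIMITERS = " \t\n.,;:!?\"'()[]{}"        # assumed delimiter set

def tokenize_blacklist(text):
    """Proposed behavior: only listed delimiters split tokens."""
    pattern = "[" + re.escape(DELIMITERS) + "]+"
    return [t for t in re.split(pattern, text) if t]

text = "café au lait, s'il vous plaît"
# Whitelist mangles accented words; blacklist keeps them intact.
assert tokenize_whitelist(text) == ["caf", "au", "lait", "s", "il",
                                    "vous", "pla", "t"]
assert tokenize_blacklist(text) == ["café", "au", "lait", "s", "il",
                                    "vous", "plaît"]
```

The trade-off is that the blacklist must enumerate every delimiter explicitly, so any character outside the list (including punctuation not anticipated here) becomes part of a token.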

> The current word tokenizer is too restricted.
> -
>
> Key: ASTERIXDB-2443
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2443
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> The current tokenizer is too restrictive: it treats every character outside 
> the alphanumeric range (A-Za-z0-9) as a delimiter. As a consequence, all 
> international characters are treated as delimiters. 





[jira] [Created] (ASTERIXDB-2443) The current word tokenizer is too restricted.

2018-08-15 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2443:
-

 Summary: The current word tokenizer is too restricted.
 Key: ASTERIXDB-2443
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2443
 Project: Apache AsterixDB
  Issue Type: Improvement
Reporter: Taewoo Kim
Assignee: Taewoo Kim


The current tokenizer is too restrictive: it treats every character outside 
the alphanumeric range (A-Za-z0-9) as a delimiter. As a consequence, all 
international characters are treated as delimiters. 





[jira] [Assigned] (ASTERIXDB-2437) Index-only plan cannot be generated for a composite index if both fields are used and only one field is returned.

2018-08-07 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2437:
-

Assignee: Taewoo Kim

> Index-only plan cannot be generated for a composite index if both fields are 
> used and only one field is returned.
> -
>
> Key: ASTERIXDB-2437
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2437
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> When there is a composite index, if both fields are used in the select 
> condition but only one field is returned, the compiler generates an exception.
>  
> DDL and query:
> {code:java}
> drop dataverse twitter if exists;
> create dataverse twitter if not exists;
> use twitter;
> create type typeUser if not exists as open {
>     id: int64,
>     name: string,
>     screen_name : string,
>     profile_image_url : string,
>     lang : string,
>     location: string,
>     create_at: date,
>     description: string,
>     followers_count: int32,
>     friends_count: int32,
>     statues_count: int64
> };
> create type typePlace if not exists as open{
>     country : string,
>     country_code : string,
>     full_name : string,
>     id : string,
>     name : string,
>     place_type : string,
>     bounding_box : rectangle
> };
> create type typeGeoTag if not exists as open {
>     stateID: int32,
>     stateName: string,
>     countyID: int32,
>     countyName: string,
>     cityID: int32?,
>     cityName: string?
> };
> create type typeTweet if not exists as open {
>     create_at : datetime,
>     id: int64,
>     text: string,
>     in_reply_to_status : int64,
>     in_reply_to_user : int64,
>     favorite_count : int64,
>     coordinate: point?,
>     retweet_count : int64,
>     lang : string,
>     is_retweet: boolean,
>     hashtags : {{ string }} ?,
>     user_mentions : {{ int64 }} ? ,
>     user : typeUser,
>     place : typePlace?,
>     geo_tag: typeGeoTag
> };
> create dataset ds_tweet(typeTweet) if not exists primary key id;
> create index create_at_status_count_idx on ds_tweet(user.create_at, 
> user.statues_count);
> select value count(first.create_at) from (
> select t.user.create_at, t.user.statues_count, t.id from ds_tweet t
> where
>       t.user.create_at   >=
>       date_from_unix_time_in_days(1) and
>       t.user.create_at   <
>       date_from_unix_time_in_days(12000) and
>           t.user.statues_count  >= 0 and
>           t.user.statues_count  <  100
> ) first;
> {code}
>  
> Exception:
> {code:java}
> org.apache.hyracks.algebricks.common.exceptions.AlgebricksException: Could 
> not infer type for variable '$$57'.
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitVariableReferenceExpression(SetClosedRecordConstructorsRule.java:170)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitVariableReferenceExpression(SetClosedRecordConstructorsRule.java:77)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.core.algebra.expressions.VariableReferenceExpression.accept(VariableReferenceExpression.java:93)
>  ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:150)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:77)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.core.algebra.visitors.AbstractConstVarFunVisitor.visitScalarFunctionCallExpression(AbstractConstVarFunVisitor.java:39)
>  ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.algebricks.core.algebra.expressions.ScalarFunctionCallExpression.accept(ScalarFunctionCallExpression.java:55)
>  ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:150)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:77)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> 

[jira] [Created] (ASTERIXDB-2437) Index-only plan cannot be generated for a composite index if both fields are used and only one field is returned.

2018-08-07 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2437:
-

 Summary: Index-only plan cannot be generated for a composite index 
if both fields are used and only one field is returned.
 Key: ASTERIXDB-2437
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2437
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


When there is a composite index, if both fields are used in the select 
condition but only one field is returned, the compiler generates an exception.

 

DDL and query:
{code:java}
drop dataverse twitter if exists;
create dataverse twitter if not exists;
use twitter;

create type typeUser if not exists as open {
    id: int64,
    name: string,
    screen_name : string,
    profile_image_url : string,
    lang : string,
    location: string,
    create_at: date,
    description: string,
    followers_count: int32,
    friends_count: int32,
    statues_count: int64
};

create type typePlace if not exists as open{
    country : string,
    country_code : string,
    full_name : string,
    id : string,
    name : string,
    place_type : string,
    bounding_box : rectangle
};

create type typeGeoTag if not exists as open {
    stateID: int32,
    stateName: string,
    countyID: int32,
    countyName: string,
    cityID: int32?,
    cityName: string?
};

create type typeTweet if not exists as open {
    create_at : datetime,
    id: int64,
    text: string,
    in_reply_to_status : int64,
    in_reply_to_user : int64,
    favorite_count : int64,
    coordinate: point?,
    retweet_count : int64,
    lang : string,
    is_retweet: boolean,
    hashtags : {{ string }} ?,
    user_mentions : {{ int64 }} ? ,
    user : typeUser,
    place : typePlace?,
    geo_tag: typeGeoTag
};

create dataset ds_tweet(typeTweet) if not exists primary key id;

create index create_at_status_count_idx on ds_tweet(user.create_at, 
user.statues_count);

select value count(first.create_at) from (
select t.user.create_at, t.user.statues_count, t.id from ds_tweet t
where
      t.user.create_at   >=
      date_from_unix_time_in_days(1) and
      t.user.create_at   <
      date_from_unix_time_in_days(12000) and
          t.user.statues_count  >= 0 and
          t.user.statues_count  <  100
) first;
{code}
 

Exception:
{code:java}
org.apache.hyracks.algebricks.common.exceptions.AlgebricksException: Could not 
infer type for variable '$$57'.
at 
org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitVariableReferenceExpression(SetClosedRecordConstructorsRule.java:170)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitVariableReferenceExpression(SetClosedRecordConstructorsRule.java:77)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.core.algebra.expressions.VariableReferenceExpression.accept(VariableReferenceExpression.java:93)
 ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:150)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:77)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.core.algebra.visitors.AbstractConstVarFunVisitor.visitScalarFunctionCallExpression(AbstractConstVarFunVisitor.java:39)
 ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.core.algebra.expressions.ScalarFunctionCallExpression.accept(ScalarFunctionCallExpression.java:55)
 ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:150)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.visitFunctionCallExpression(SetClosedRecordConstructorsRule.java:77)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.core.algebra.visitors.AbstractConstVarFunVisitor.visitScalarFunctionCallExpression(AbstractConstVarFunVisitor.java:39)
 ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.core.algebra.expressions.ScalarFunctionCallExpression.accept(ScalarFunctionCallExpression.java:55)
 ~[algebricks-core-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.SetClosedRecordConstructorsRule$SettingClosedRecordVisitor.transform(SetClosedRecordConstructorsRule.java:91)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 

[jira] [Commented] (ASTERIXDB-2125) NotImplementedException when Bulk Load LSMRTree

2018-08-02 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567403#comment-16567403
 ] 

Taewoo Kim commented on ASTERIXDB-2125:
---

The problem is still happening.

> NotImplementedException when Bulk Load LSMRTree
> ---
>
> Key: ASTERIXDB-2125
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2125
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: IDX - Indexes
>Reporter: Chen Luo
>Priority: Major
>
> When using the twitter dataset (same as the one used by Cloudberry), creating 
> a new LSM RTree index throws NotImplementedException (both for datasets using 
> prefix or correlated merge policy).
> Stack trace:
> {code}
>  org.apache.hyracks.algebricks.common.exceptions.NotImplementedException: 
> Value provider for type missing is not implemented
> org.apache.hyracks.api.exceptions.HyracksDataException: 
> org.apache.hyracks.algebricks.common.exceptions.NotImplementedException: 
> Value provider for type missing is not implemented
>   at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:134)
>   at 
> org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:63)
>   at org.apache.hyracks.control.nc.Task.run(Task.java:367)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: 
> org.apache.hyracks.algebricks.common.exceptions.NotImplementedException: 
> Value provider for type missing is not implemented
>   at 
> org.apache.asterix.dataflow.data.nontagged.valueproviders.PrimitiveValueProviderFactory$1.getValue(PrimitiveValueProviderFactory.java:60)
>   at 
> org.apache.hyracks.storage.am.rtree.frames.RTreeNSMFrame.calculateMBRImpl(RTreeNSMFrame.java:131)
>   at 
> org.apache.hyracks.storage.am.rtree.frames.RTreeNSMFrame.adjustMBR(RTreeNSMFrame.java:152)
>   at 
> org.apache.hyracks.storage.am.rtree.impls.RTree$RTreeBulkLoader.propagateBulk(RTree.java:1047)
>   at 
> org.apache.hyracks.storage.am.rtree.impls.RTree$RTreeBulkLoader.add(RTree.java:948)
>   at 
> org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMDiskComponentBulkLoader.add(AbstractLSMDiskComponentBulkLoader.java:91)
>   at 
> org.apache.hyracks.storage.am.lsm.rtree.impls.LSMRTreeWithAntiMatterTuples$LSMRTreeWithAntiMatterTuplesBulkLoader.add(LSMRTreeWithAntiMatterTuples.java:292)
>   at 
> org.apache.hyracks.storage.am.common.dataflow.IndexBulkLoadOperatorNodePushable.nextFrame(IndexBulkLoadOperatorNodePushable.java:81)
>   at 
> org.apache.hyracks.api.dataflow.EnforceFrameWriter.nextFrame(EnforceFrameWriter.java:76)
>   at 
> org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.write(AbstractFrameAppender.java:93)
>   at 
> org.apache.hyracks.dataflow.common.comm.util.FrameUtils.appendToWriter(FrameUtils.java:121)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.flush(AbstractFrameSorter.java:172)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractExternalSortRunMerger.process(AbstractExternalSortRunMerger.java:90)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:181)
>   at 
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:204)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   ... 3 more
> {code}
> This appears to be a bug within the R-tree, because when we perform a bulk 
> load, we first filter out all entries with missing or null secondary keys.
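The filtering step the comment above refers to can be sketched like this (illustrative Python, not AsterixDB code; the `MISSING` sentinel and the function name are hypothetical):

```python
# Before bulk-loading an R-tree secondary index, tuples whose secondary key
# is MISSING or NULL are dropped, so the value provider should never see a
# "missing"-typed value during MBR computation.
MISSING = object()  # stand-in for AsterixDB's MISSING value

def filter_for_bulk_load(tuples, key):
    """Keep only tuples whose secondary key is present and non-null."""
    return [t for t in tuples
            if t.get(key, MISSING) is not MISSING and t[key] is not None]

tweets = [
    {"id": 1, "coordinate": (1.0, 2.0)},
    {"id": 2, "coordinate": None},  # null key: excluded
    {"id": 3},                      # missing key: excluded
]
loadable = filter_for_bulk_load(tweets, "coordinate")
assert [t["id"] for t in loadable] == [1]
```

If this filter is applied, the "Value provider for type missing" exception in the stack trace above should be unreachable, which is why the reporter suspects the bug is inside the R-tree itself.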





[jira] [Closed] (ASTERIXDB-2409) Full-text search returns a subset of true results (only one frame) for multiple keywords queries if the result size is greater than one frame.

2018-07-10 Thread Taewoo Kim (JIRA)


 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2409.
-
Resolution: Fixed

> Full-text search returns a subset of true results (only one frame) for 
> multiple keywords queries if the result size is greater than one frame.
> --
>
> Key: ASTERIXDB-2409
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2409
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> Currently, the full-text search returns only a subset of the true results 
> (a single frame) for multiple-keyword queries if the result size is greater 
> than one frame.





[jira] [Created] (ASTERIXDB-2409) Full-text search returns a subset of true results (only one frame) for multiple keywords queries if the result size is greater than one frame.

2018-07-09 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2409:
-

 Summary: Full-text search returns a subset of true results (only 
one frame) for multiple keywords queries if the result size is greater than one 
frame.
 Key: ASTERIXDB-2409
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2409
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim
Assignee: Taewoo Kim


Currently, the full-text search returns only a subset of the true results 
(a single frame) for multiple-keyword queries if the result size is greater 
than one frame.





[jira] [Created] (ASTERIXDB-2407) An AsterixDB instance isn't able to answer queries. But the status seems fine.

2018-07-07 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2407:
-

 Summary: An AsterixDB instance isn't able to answer queries. But 
the status seems fine.
 Key: ASTERIXDB-2407
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2407
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim
 Attachments: cc_last_2m_lines.log.gz, nc-1.log.gz

This happened on the Cloudberry cluster.

 

1. A simple count query against a dataset (cardinality less than 1000) took 
more than five minutes.
{code:java}
select count(*) from simple-dataset;{code}
 

2. The status URL said the status was ACTIVE.
{code:java}
x.x.x.x.:19002/admin/cluster{code}
 

3. The log files seemed OK. See the attached log files.

 

4. The ingestion process was active but could not ingest any records. That 
is, the ingested-record count did not increase.
{code:java}
x.x.x.x:19002/admin/active{code}
 

5. Stopping the feed did not succeed; the cluster did not respond.
{code:java}
stop feed ;{code}
 

6. Around 22:37 yesterday, the DB was restarted. See the attached log files.

 

[^cc_last_2m_lines.log.gz]

[^nc-1.log.gz]





[jira] [Commented] (ASTERIXDB-2390) Cannot resolve ambiguous alias reference

2018-05-22 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484569#comment-16484569
 ] 

Taewoo Kim commented on ASTERIXDB-2390:
---

Got it. Thanks!!

> Cannot resolve ambiguous alias reference
> 
>
> Key: ASTERIXDB-2390
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2390
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
>
> The following query generates a CompilationException.
> {code:java}
> drop dataverse twitter if exists;
> create dataverse twitter if not exists;
> use twitter;
> create type typeUser if not exists as open {
> id: int64,
> name: string,
> screen_name : string,
> profile_image_url : string,
> lang : string,
> location: string,
> create_at: date,
> description: string,
> followers_count: int32,
> friends_count: int32,
> statues_count: int64
> };
> create type typePlace if not exists as open{
> country : string,
> country_code : string,
> full_name : string,
> id : string,
> name : string,
> place_type : string,
> bounding_box : rectangle
> };
> create type typeGeoTag if not exists as open {
> stateID: int32,
> stateName: string,
> countyID: int32,
> countyName: string,
> cityID: int32?,
> cityName: string?
> };
> create type typeTweet if not exists as open {
> create_at : datetime,
> id: int64,
> text: string,
> in_reply_to_status : int64,
> in_reply_to_user : int64,
> favorite_count : int64,
> coordinate: point?,
> retweet_count : int64,
> lang : string,
> is_retweet: boolean,
> hashtags : {{ string }} ?,
> user_mentions : {{ int64 }} ? ,
> user : typeUser,
> place : typePlace?,
> geo_tag: typeGeoTag
> };
> create dataset ds_tweet(typeTweet) if not exists primary key id with filter 
> on create_at with 
> {"merge-policy":{"name":"prefix","parameters":{"max-mergable-component-size":536870912,
>  "max-tolerance-component-count":5}}};
> create index text_idx if not exists on ds_tweet(text) type fulltext;{code}
>  
> Query:
> {code:java}
> USE twitter;
> SELECT spatial_cell(get_points(t.place.bounding_box)[0], 
> create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet t
> WHERE ftcontains(t.text, ['rain'], {'mode':'any'}) AND t.place.bounding_box 
> IS NOT unknown 
> AND t.create_at >= datetime('2017-02-25T00:00:00') AND t.create_at < 
> datetime('2017-02-26T00:00:00') 
> GROUP BY cell;{code}
>  
> Exception:
> {code:java}
> Cannot resolve ambiguous alias reference for undefined identifier t in [#1, 
> $cell] [CompilationException]
> {code}
>  
> The same query that doesn't generate an exception:
> {code:java}
> USE twitter;
> SELECT spatial_cell(get_points(t.place.bounding_box)[0], 
> create_point(0.0,0.0),1.0,1.0), count(*) AS cnt FROM ds_tweet t
> WHERE ftcontains(t.text, ['rain'], {'mode':'any'}) AND t.place.bounding_box 
> IS NOT unknown 
> AND t.create_at >= datetime('2017-02-25T00:00:00') AND t.create_at < 
> datetime('2017-02-26T00:00:00') 
> GROUP BY spatial_cell(get_points(t.place.bounding_box)[0], 
> create_point(0.0,0.0),1.0,1.0);
> {code}
>  





[jira] [Closed] (ASTERIXDB-2390) Cannot resolve ambiguous alias reference

2018-05-22 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2390.
-
Resolution: Invalid

> Cannot resolve ambiguous alias reference
> 
>
> Key: ASTERIXDB-2390
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2390
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
>





[jira] [Comment Edited] (ASTERIXDB-2390) Cannot resolve ambiguous alias reference

2018-05-22 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484459#comment-16484459
 ] 

Taewoo Kim edited comment on ASTERIXDB-2390 at 5/22/18 7:36 PM:


I just checked appendix_3 and I am still confused. What do you mean by "t is 
not in scope after the GROUP BY"? Do you mean operators after the GROUP BY, or 
literally the XXX in "GROUP BY XXX"? 

Could you revise the first SQL++ query to make it work using an alias? Even if 
I don't use `t`, it generates a similar error.


was (Author: wangsaeu):
I just checked the appendiex_3 and I am still confused. What do you mean by "t 
is not in scope after the GROUP BY"? You mean operators after the GROUP BY or 
literally means XXX in "GROUP BY XXX"? 

Could you revise the first SQLPP query to make it work using an alias? Even if 
I don't use `t`, it generates a similar error

> Cannot resolve ambiguous alias reference
> 
>
> Key: ASTERIXDB-2390
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2390
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
>





[jira] [Comment Edited] (ASTERIXDB-2390) Cannot resolve ambiguous alias reference

2018-05-22 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484459#comment-16484459
 ] 

Taewoo Kim edited comment on ASTERIXDB-2390 at 5/22/18 7:36 PM:


I just checked appendix_3 and I am still confused. What do you mean by "t is 
not in scope after the GROUP BY"? Do you mean operators after the GROUP BY, or 
literally the XXX in "GROUP BY XXX"? 

Could you revise the first SQL++ query to make it work using an alias? Even if 
I don't use `t`, it generates a similar error.


was (Author: wangsaeu):
I just checked the appendiex_3 and I am still confused. What do you mean by "t 
is not in scope after the GROUP BY"? You mean operators after the GROUP BY or 
literally means XXX in "GROUP BY XXX"? 

Could you revise the first statement? Even if I don't use t, it generates a 
similar error

> Cannot resolve ambiguous alias reference
> 
>
> Key: ASTERIXDB-2390
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2390
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ASTERIXDB-2390) Cannot resolve ambiguous alias reference

2018-05-22 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484459#comment-16484459
 ] 

Taewoo Kim commented on ASTERIXDB-2390:
---

I just checked the appendiex_3 and I am still confused. What do you mean by "t 
is not in scope after the GROUP BY"? Do you mean operators after the GROUP BY, 
or literally the XXX in "GROUP BY XXX"?

Could you revise the first statement? Even if I don't use t, it generates a 
similar error.

> Cannot resolve ambiguous alias reference
> 
>
> Key: ASTERIXDB-2390
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2390
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
>
> The following query generates a CompilationException.
> {code:java}
> drop dataverse twitter if exists;
> create dataverse twitter if not exists;
> use twitter;
> create type typeUser if not exists as open {
> id: int64,
> name: string,
> screen_name : string,
> profile_image_url : string,
> lang : string,
> location: string,
> create_at: date,
> description: string,
> followers_count: int32,
> friends_count: int32,
> statues_count: int64
> };
> create type typePlace if not exists as open{
> country : string,
> country_code : string,
> full_name : string,
> id : string,
> name : string,
> place_type : string,
> bounding_box : rectangle
> };
> create type typeGeoTag if not exists as open {
> stateID: int32,
> stateName: string,
> countyID: int32,
> countyName: string,
> cityID: int32?,
> cityName: string?
> };
> create type typeTweet if not exists as open {
> create_at : datetime,
> id: int64,
> text: string,
> in_reply_to_status : int64,
> in_reply_to_user : int64,
> favorite_count : int64,
> coordinate: point?,
> retweet_count : int64,
> lang : string,
> is_retweet: boolean,
> hashtags : {{ string }} ?,
> user_mentions : {{ int64 }} ? ,
> user : typeUser,
> place : typePlace?,
> geo_tag: typeGeoTag
> };
> create dataset ds_tweet(typeTweet) if not exists primary key id with filter 
> on create_at with 
> {"merge-policy":{"name":"prefix","parameters":{"max-mergable-component-size":536870912,
>  "max-tolerance-component-count":5}}};
> create index text_idx if not exists on ds_tweet(text) type fulltext;{code}
>  
> Query:
> {code:java}
> USE twitter;
> SELECT spatial_cell(get_points(t.place.bounding_box)[0], 
> create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet t
> WHERE ftcontains(t.text, ['rain'], {'mode':'any'}) AND t.place.bounding_box 
> IS NOT unknown 
> AND t.create_at >= datetime('2017-02-25T00:00:00') AND t.create_at < 
> datetime('2017-02-26T00:00:00') 
> GROUP BY cell;{code}
>  
> Exception:
> {code:java}
> Cannot resolve ambiguous alias reference for undefined identifier t in [#1, 
> $cell] [CompilationException]
> {code}
>  
> The same query that doesn't generate an exception:
> {code:java}
> USE twitter;
> SELECT spatial_cell(get_points(t.place.bounding_box)[0], 
> create_point(0.0,0.0),1.0,1.0), count(*) AS cnt FROM ds_tweet t
> WHERE ftcontains(t.text, ['rain'], {'mode':'any'}) AND t.place.bounding_box 
> IS NOT unknown 
> AND t.create_at >= datetime('2017-02-25T00:00:00') AND t.create_at < 
> datetime('2017-02-26T00:00:00') 
> GROUP BY spatial_cell(get_points(t.place.bounding_box)[0], 
> create_point(0.0,0.0),1.0,1.0);
> {code}
>  





[jira] [Created] (ASTERIXDB-2390) Cannot resolve ambiguous alias reference

2018-05-21 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2390:
-

 Summary: Cannot resolve ambiguous alias reference
 Key: ASTERIXDB-2390
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2390
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


The following query generates a CompilationException.
{code:java}
drop dataverse twitter if exists;
create dataverse twitter if not exists;
use twitter;

create type typeUser if not exists as open {
id: int64,
name: string,
screen_name : string,
profile_image_url : string,
lang : string,
location: string,
create_at: date,
description: string,
followers_count: int32,
friends_count: int32,
statues_count: int64
};

create type typePlace if not exists as open{
country : string,
country_code : string,
full_name : string,
id : string,
name : string,
place_type : string,
bounding_box : rectangle
};

create type typeGeoTag if not exists as open {
stateID: int32,
stateName: string,
countyID: int32,
countyName: string,
cityID: int32?,
cityName: string?
};

create type typeTweet if not exists as open {
create_at : datetime,
id: int64,
text: string,
in_reply_to_status : int64,
in_reply_to_user : int64,
favorite_count : int64,
coordinate: point?,
retweet_count : int64,
lang : string,
is_retweet: boolean,
hashtags : {{ string }} ?,
user_mentions : {{ int64 }} ? ,
user : typeUser,
place : typePlace?,
geo_tag: typeGeoTag
};

create dataset ds_tweet(typeTweet) if not exists primary key id with filter on 
create_at with 
{"merge-policy":{"name":"prefix","parameters":{"max-mergable-component-size":536870912,
 "max-tolerance-component-count":5}}};

create index text_idx if not exists on ds_tweet(text) type fulltext;{code}
 

Query:
{code:java}
USE twitter;
SELECT spatial_cell(get_points(t.place.bounding_box)[0], 
create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet t
WHERE ftcontains(t.text, ['rain'], {'mode':'any'}) AND t.place.bounding_box IS 
NOT unknown 
AND t.create_at >= datetime('2017-02-25T00:00:00') AND t.create_at < 
datetime('2017-02-26T00:00:00') 
GROUP BY cell;{code}
 

Exception:
{code:java}
Cannot resolve ambiguous alias reference for undefined identifier t in [#1, 
$cell] [CompilationException]
{code}
 

The same query that doesn't generate an exception:
{code:java}
USE twitter;
SELECT spatial_cell(get_points(t.place.bounding_box)[0], 
create_point(0.0,0.0),1.0,1.0), count(*) AS cnt FROM ds_tweet t
WHERE ftcontains(t.text, ['rain'], {'mode':'any'}) AND t.place.bounding_box IS 
NOT unknown 
AND t.create_at >= datetime('2017-02-25T00:00:00') AND t.create_at < 
datetime('2017-02-26T00:00:00') 
GROUP BY spatial_cell(get_points(t.place.bounding_box)[0], 
create_point(0.0,0.0),1.0,1.0);
{code}
 





[jira] [Created] (ASTERIXDB-2382) An NC instance becomes FAILED status after generating "TCP read error"

2018-05-02 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2382:
-

 Summary: An NC instance becomes FAILED status after generating 
"TCP read error"
 Key: ASTERIXDB-2382
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2382
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


An NC belonging to the Cloudberry cluster enters FAILED status after generating 
the following exceptions.

 
{code:java}
00:08:07.406 [TCPEndpoint IO Thread] ERROR 
org.apache.hyracks.net.protocols.tcp.TCPEndpoint - Unexpected tcp io error
java.io.IOException: Invalid argument
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0]
at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[?:1.8.0]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375) ~[?:1.8.0]
at 
org.apache.hyracks.net.protocols.muxdemux.FullFrameChannelReadInterface.read(FullFrameChannelReadInterface.java:72)
 ~[hyracks-net-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.net.protocols.muxdemux.ChannelControlBlock.read(ChannelControlBlock.java:90)
 ~[hyracks-net-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.driveReaderStateMachine(MultiplexedConnection.java:418)
 ~[hyracks-net-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.net.protocols.muxdemux.MultiplexedConnection.notifyIOReady(MultiplexedConnection.java:132)
 ~[hyracks-net-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.net.protocols.tcp.TCPEndpoint$IOThread.run(TCPEndpoint.java:179)
 [hyracks-net-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
00:08:07.464 [Executor-209108:4] WARN org.apache.hyracks.control.nc.Task - Task 
TAID:TID:ANID:ODID:1:1:6:0 failed with exception
org.apache.hyracks.api.exceptions.HyracksDataException: 
java.nio.channels.CancelledKeyException
at 
org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51)
 ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.std.connectors.PartitionDataWriter.wrapException(PartitionDataWriter.java:178)
 ~[hyracks-dataflow-std-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.std.connectors.PartitionDataWriter.close(PartitionDataWriter.java:87)
 ~[hyracks-dataflow-std-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:62)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.close(AlgebricksMetaOperatorDescriptor.java:156)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.close(OptimizedHybridHashJoinOperatorDescriptor.java:465)
 ~[hyracks-dataflow-std-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:410) 
~[hyracks-control-nc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at org.apache.hyracks.control.nc.Task.run(Task.java:324) 
[hyracks-control-nc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0]
at java.lang.Thread.run(Thread.java:744) [?:1.8.0]
Suppressed: org.apache.hyracks.api.exceptions.HyracksDataException: 
java.nio.channels.CancelledKeyException
at 
org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51)
 ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.std.connectors.PartitionDataWriter.wrapException(PartitionDataWriter.java:178)
 ~[hyracks-dataflow-std-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.std.connectors.PartitionDataWriter.fail(PartitionDataWriter.java:149)
 ~[hyracks-dataflow-std-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.std.connectors.PartitionDataWriter.close(PartitionDataWriter.java:93)
 ~[hyracks-dataflow-std-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:62)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.close(AlgebricksMetaOperatorDescriptor.java:156)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.close(OptimizedHybridHashJoinOperatorDescriptor.java:465)
 

[jira] [Updated] (ASTERIXDB-2374) Index-only plan does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2374:
--
Description: 
Currently, the index-only plan is generated, but none of the search results go 
down the index-only-plan path. That is, every search result from a 
secondary-index search is fed into the primary index.

 

(EDIT) The cause is that when we fetch a tuple from disk, the result of 
searchCallback.proceed() always needs to be set to true, and it currently isn't. 

  was:
Currently, the index-only plan is generated, but none of search result goes to 
the index-only-plan path. That is, every search result from a secondary index 
search is fed into the primary index.

 

The cause is that after loading tuples from in-memory search cursor to a 
priority queue, we set a boolean variable "includeMutableComponents" to false. 
Thus, when LSMBTreeRangeSearchCursor conducts a search, it thinks that it is 
fetching tuples from the disk and does not apply "searchCallback.proceed()". 
The index-only plan relies on this result and it is always set to false. 


> Index-only plan does not work as expected.
> --
>
> Key: ASTERIXDB-2374
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2374
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> Currently, the index-only plan is generated, but none of the search result 
> goes to the index-only-plan path. That is, every search result from a 
> secondary index search is fed into the primary index.
>  
> (EDIT) The cause is that if we fetch a tuple from disk, the result of 
> searchCallback.proceed() always needs to be set to true. And it isn't now. 





[jira] [Commented] (ASTERIXDB-2372) Providing a float value predicate to an integer primary index does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454858#comment-16454858
 ] 

Taewoo Kim commented on ASTERIXDB-2372:
---

id = 1.0 is a valid predicate for an equality search on an integer index. So, 
id >= 1 and id <= 1 will return one result (id = 1).

But, id = 1.3 is not a valid predicate. However, id >= 1 and id <= 2 returns two 
results (id = 1 and id = 2). Thus, we need to remove the inclusiveness option.

 

So, in short, if ceil(x) = floor(x), we maintain the current framework (id 
>= floor(x) and id <= ceil(x)). If not, we remove the inclusiveness option.
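
The rule above can be sketched as follows. This is a hypothetical illustration 
in plain Java, not AsterixDB's actual optimizer code; the method name 
rewriteEquality is invented for the example.

```java
// Hypothetical sketch of the proposed fix: rewrite "id = x" on an integer
// index as a range, keeping the bounds inclusive only when x is integral.
public class FloatPredicateRewrite {

    static String rewriteEquality(double x) {
        long lo = (long) Math.floor(x);
        long hi = (long) Math.ceil(x);
        if (lo == hi) {
            // ceil(x) == floor(x): x is integral (e.g., 1.0); keep inclusiveness.
            return "id >= " + lo + " AND id <= " + hi;
        }
        // ceil(x) != floor(x): x is fractional (e.g., 1.3); drop the
        // inclusiveness so no integer key can satisfy the range.
        return "id > " + lo + " AND id < " + hi;
    }

    public static void main(String[] args) {
        System.out.println(rewriteEquality(1.0)); // id >= 1 AND id <= 1
        System.out.println(rewriteEquality(1.3)); // id > 1 AND id < 2
    }
}
```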

> Providing a float value predicate to an integer primary index does not work 
> as expected.
> 
>
> Key: ASTERIXDB-2372
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2372
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> If we have an integer primary index and feed a float value predicate that is 
> not an integer such as 1.3, the search result is not correct.
>  
> The DDL and DML
> {code:java}
> drop dataverse test if exists;
> create dataverse test;
> use test;
> create type MyRecord as closed {
>   id: int64
> };
> create dataset MyData(MyRecord) primary key id;
> insert into MyData({"id":1});
> insert into MyData({"id":2});
> select * from MyData where id = 1.3;{code}
>  
> The result should be empty. But, it returns 1 and 2 as the result.
>  





[jira] [Commented] (ASTERIXDB-2372) Providing a float value predicate to an integer primary index does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454837#comment-16454837
 ] 

Taewoo Kim commented on ASTERIXDB-2372:
---

It's case #1: we maintain the inclusiveness option (id >= 1 and id <= 1).

> Providing a float value predicate to an integer primary index does not work 
> as expected.
> 
>
> Key: ASTERIXDB-2372
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2372
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> If we have an integer primary index and feed a float value predicate that is 
> not an integer such as 1.3, the search result is not correct.
>  
> The DDL and DML
> {code:java}
> drop dataverse test if exists;
> create dataverse test;
> use test;
> create type MyRecord as closed {
>   id: int64
> };
> create dataset MyData(MyRecord) primary key id;
> insert into MyData({"id":1});
> insert into MyData({"id":2});
> select * from MyData where id = 1.3;{code}
>  
> The result should be empty. But, it returns 1 and 2 as the result.
>  





[jira] [Updated] (ASTERIXDB-2374) Index-only plan does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2374:
--
Summary: Index-only plan does not work as expected.  (was: Index-only plan 
does work as expected.)

> Index-only plan does not work as expected.
> --
>
> Key: ASTERIXDB-2374
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2374
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Critical
>
> Currently, the index-only plan is generated, but none of search result goes 
> to the index-only-plan path. That is, every search result from a secondary 
> index search is fed into the primary index.
>  
> The cause is that after loading tuples from in-memory search cursor to a 
> priority queue, we set a boolean variable "includeMutableComponents" to 
> false. Thus, when LSMBTreeRangeSearchCursor conducts a search, it thinks that 
> it is fetching tuples from the disk and does not apply 
> "searchCallback.proceed()". The index-only plan relies on this result and it 
> is always set to false. 





[jira] [Assigned] (ASTERIXDB-2374) Index-only plan does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2374:
-

Assignee: Taewoo Kim

> Index-only plan does not work as expected.
> --
>
> Key: ASTERIXDB-2374
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2374
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> Currently, the index-only plan is generated, but none of search result goes 
> to the index-only-plan path. That is, every search result from a secondary 
> index search is fed into the primary index.
>  
> The cause is that after loading tuples from in-memory search cursor to a 
> priority queue, we set a boolean variable "includeMutableComponents" to 
> false. Thus, when LSMBTreeRangeSearchCursor conducts a search, it thinks that 
> it is fetching tuples from the disk and does not apply 
> "searchCallback.proceed()". The index-only plan relies on this result and it 
> is always set to false. 





[jira] [Created] (ASTERIXDB-2374) Index-only plan does work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2374:
-

 Summary: Index-only plan does work as expected.
 Key: ASTERIXDB-2374
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2374
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


Currently, the index-only plan is generated, but none of the search results go 
down the index-only-plan path. That is, every search result from a 
secondary-index search is fed into the primary index.

The cause is that after loading tuples from the in-memory search cursor into a 
priority queue, we set a boolean variable "includeMutableComponents" to false. 
Thus, when LSMBTreeRangeSearchCursor conducts a search, it thinks that it is 
fetching tuples from disk and does not apply "searchCallback.proceed()". 
The index-only plan relies on this result, and it is always set to false. 
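
The flow described above can be sketched as follows. This is a hypothetical 
illustration of the reported behavior, not the real LSMBTreeRangeSearchCursor 
code; the names proceedFlag and searchCallbackProceed are invented for the 
example.

```java
// Hypothetical illustration of the bug: once includeMutableComponents is
// false, the cursor behaves as if every tuple came from a disk component,
// so the search callback is never consulted and the flag that the
// index-only path depends on stays false.
public class IndexOnlyProceedSketch {

    // Simulates whether searchCallback.proceed() gets a chance to report true.
    static boolean proceedFlag(boolean includeMutableComponents) {
        if (includeMutableComponents) {
            // In-memory component: the callback is actually invoked.
            return searchCallbackProceed();
        }
        // Disk-component path: proceed() is skipped, so the flag the
        // index-only plan checks is left at its default, false.
        return false;
    }

    static boolean searchCallbackProceed() {
        return true; // assume the callback would succeed
    }

    public static void main(String[] args) {
        // After tuples move to the priority queue the flag is false,
        // so every result is routed through the primary index instead
        // of the index-only path.
        System.out.println(proceedFlag(false)); // false
    }
}
```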





[jira] [Comment Edited] (ASTERIXDB-2372) Providing a float value predicate to an integer primary index does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454783#comment-16454783
 ] 

Taewoo Kim edited comment on ASTERIXDB-2372 at 4/26/18 8:14 PM:


Fact 1. A B-tree search is a range search. We need to provide a low value (L), 
a high value (H), and their inclusiveness. For example, a general B-tree search 
query is "id > 1 and id <= 5". In this case, the low value is 1 and the high 
value is 5; the low value should not be included in the result and the 
high value should be included. An equality-condition search such as "id = 
1" is translated as "id >= 1 and id <= 1" to get only one value. 

Fact 2. For the equality-condition search, if a float or double value is 
provided to an integer index, we transform it to two values - floor( x ) and 
ceil( x ). The intuition behind this is that the search only makes sense when 
floor( x ) is equal to ceil( x ). This is only true when a float or double 
value only contains the integral part (e.g., 1.0, 8.0). For other values, ceil( 
x ) is not equal to floor ( x ). 

The cause: "id = 1.3" is translated as "id >= 1 and id 
<= 2" because floor(1.3) = 1 and ceil(1.3) = 2. So, two tuples are returned, 
since AsterixDB removed the SELECT operator that checks "id = 1.3" by pushing 
that condition into the index search. 

The solution is to remove the inclusiveness option when ceil() and floor() 
translate a float value into two different values.

Case #1:

query: id = 1.0 (floor(1.0) = 1 and ceil(1.0) = 1)

internal: id >=1 and id <= 1   : maintained the inclusiveness

 

Case #2:

query: id = 1.3 (floor(1.3) = 1 and ceil(1.3) = 2)

internal: id > 1 and id < 2   : removed the inclusiveness

 

 

This can also be true for a B-tree secondary index, but it is not observed, since:

(1) The index-only plan does not work properly now. The plan is generated 
correctly, but every tuple goes to the primary-index search because of 
a handling issue in LSMBTreeRangeSearchCursor.

(2) A SELECT operator will be applied in the non-index-only-plan case.

 

 

We need to fix two things:

(1) Make the index-only plan work

(2) Remove the inclusiveness option when ceil(x) is not equal to floor(x). 

 


was (Author: wangsaeu):
Fact 1. Btree-search is a range search. We need to provide a low value (L) and 
a high value (H) and their inclusiveness. For example, a genral B-tree search 
query is "id > 1 and id <= 5". In this case, the low value is 1 and the high 
value is 5. Also, the low value should not be included in the result and the 
high value should be included. For an equality-condition search such as "id = 
1" is translated as "id >=1 and id <= 1" to get only one value. 

Fact 2. For the equality-condition search, if a float or double value is 
provided to an integer index, we transform it to two values - floor( x ) and 
ceil( x ). The intuition behind this is that the search only makes sense when 
floor( x ) is equal to ceil( x ). This is only true when a float or double 
value only contains the integral part (e.g., 1.0, 8.0). For other values, ceil( 
x ) is not equal to floor ( x ). 

The cause: the cause is that "id = 1.3" will be translated as "id >=1 and id 
<=2" because floor(1.3) = 1 and ceil(1.3) = 2. So, two tuples are returned 
since AsterixDB removed the SELECT operator that checks "id = 1.3" by providing 
that condition to an index-search. 

The solution is removing the inclusiveness option when a float value is 
translated as two different values when applying ceil() and floor().

Case #1:

query: id = 1.0 (floor(1) = 1 and ceil(1) = 2)

internal: id >=1 and id <= 1   : maintained the inclusiveness

 

Case #2:

query: id = 1.3 (floor(1.3) = 1 and ceil(1.3) = 2)

internal: id > 1 and id < 2   : removed the inclusiveness

 

 

> Providing a float value predicate to an integer primary index does not work 
> as expected.
> 
>
> Key: ASTERIXDB-2372
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2372
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> If we have an integer primary index and feed a float value predicate that is 
> not an integer such as 1.3, the search result is not correct.
>  
> The DDL and DML
> {code:java}
> drop dataverse test if exists;
> create dataverse test;
> use test;
> create type MyRecord as closed {
>   id: int64
> };
> create dataset MyData(MyRecord) primary key id;
> insert into MyData({"id":1});
> insert into MyData({"id":2});
> select * from MyData where id = 1.3;{code}
>  
> The result should be empty. But, it returns 1 and 2 as the result.
>  





[jira] [Comment Edited] (ASTERIXDB-2372) Providing a float value predicate to an integer primary index does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454783#comment-16454783
 ] 

Taewoo Kim edited comment on ASTERIXDB-2372 at 4/26/18 8:11 PM:


Fact 1. A B-tree search is a range search. We need to provide a low value (L), 
a high value (H), and their inclusiveness. For example, a general B-tree search 
query is "id > 1 and id <= 5". In this case, the low value is 1 and the high 
value is 5; the low value should not be included in the result and the 
high value should be included. An equality-condition search such as "id = 
1" is translated as "id >= 1 and id <= 1" to get only one value. 

Fact 2. For the equality-condition search, if a float or double value is 
provided to an integer index, we transform it to two values - floor( x ) and 
ceil( x ). The intuition behind this is that the search only makes sense when 
floor( x ) is equal to ceil( x ). This is only true when a float or double 
value only contains the integral part (e.g., 1.0, 8.0). For other values, ceil( 
x ) is not equal to floor ( x ). 

The cause: "id = 1.3" is translated as "id >= 1 and id 
<= 2" because floor(1.3) = 1 and ceil(1.3) = 2. So, two tuples are returned, 
since AsterixDB removed the SELECT operator that checks "id = 1.3" by pushing 
that condition into the index search. 

The solution is to remove the inclusiveness option when ceil() and floor() 
translate a float value into two different values.

Case #1:

query: id = 1.0 (floor(1.0) = 1 and ceil(1.0) = 1)

internal: id >=1 and id <= 1   : maintained the inclusiveness

 

Case #2:

query: id = 1.3 (floor(1.3) = 1 and ceil(1.3) = 2)

internal: id > 1 and id < 2   : removed the inclusiveness

 

 


was (Author: wangsaeu):
Fact 1. Btree-search is a range search. We need to provide a low value (L) and 
a high value (H) and their inclusiveness. For example, a genral B-tree search 
query is "id > 1 and id <= 5". In this case, the low value is 1 and the high 
value is 5. Also, the low value should not be included in the result and the 
high value should be included. For an equality-condition search such as "id = 
1" is translated as "id >=1 and id <= 1" to get only one value. 

Fact 2. For the equality-condition search, if a float or double value is 
provided to an integer index, we transform it to two values - floor(x) and 
ceil(x). The intuition behind this is that the search only makes sense when 
floor(x) is equal to ceil(x). This is only true when a float or double value 
only contains the integral part (e.g., 1.0, 8.0). For other values, ceil(x) is 
not equal to floor(x). 

The cause: the cause is that "id = 1.3" will be translated as "id >=1 and id 
<=2" because floor(1.3) = 1 and ceil(1.3) = 2. So, two tuples are returned 
since AsterixDB removed the SELECT operator that checks "id = 1.3" by providing 
that condition to an index-search. 

The solution is removing the inclusiveness option when a float value is 
translated as two different values when applying ceil() and floor().

Case #1:

query: id = 1.0 (floor(1) = 1 and ceil(1) = 2)

internal: id >=1 and id <= 1   : maintained the inclusiveness

 

Case #2:

query: id = 1.3 (floor(1.3) = 1 and ceil(1.3) = 2)

internal: id > 1 and id < 2   : removed the inclusiveness

 

 

> Providing a float value predicate to an integer primary index does not work 
> as expected.
> 
>
> Key: ASTERIXDB-2372
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2372
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> If we have an integer primary index and feed a float value predicate that is 
> not an integer such as 1.3, the search result is not correct.
>  
> The DDL and DML
> {code:java}
> drop dataverse test if exists;
> create dataverse test;
> use test;
> create type MyRecord as closed {
>   id: int64
> };
> create dataset MyData(MyRecord) primary key id;
> insert into MyData({"id":1});
> insert into MyData({"id":2});
> select * from MyData where id = 1.3;{code}
>  
> The result should be empty. But, it returns 1 and 2 as the result.
>  





[jira] [Commented] (ASTERIXDB-2372) Providing a float value predicate to an integer primary index does not work as expected.

2018-04-26 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454783#comment-16454783
 ] 

Taewoo Kim commented on ASTERIXDB-2372:
---

Fact 1. A B-tree search is a range search. We need to provide a low value (L), 
a high value (H), and their inclusiveness. For example, a general B-tree search 
query is "id > 1 and id <= 5". In this case, the low value is 1 and the high 
value is 5; the low value should not be included in the result and the 
high value should be included. An equality-condition search such as "id = 
1" is translated as "id >= 1 and id <= 1" to get only one value. 

Fact 2. For the equality-condition search, if a float or double value is 
provided to an integer index, we transform it to two values - floor(x) and 
ceil(x). The intuition behind this is that the search only makes sense when 
floor(x) is equal to ceil(x). This is only true when a float or double value 
only contains the integral part (e.g., 1.0, 8.0). For other values, ceil(x) is 
not equal to floor(x). 

The cause: "id = 1.3" is translated as "id >= 1 and id 
<= 2" because floor(1.3) = 1 and ceil(1.3) = 2. So, two tuples are returned, 
since AsterixDB removed the SELECT operator that checks "id = 1.3" by pushing 
that condition into the index search. 

The solution is to remove the inclusiveness option when ceil() and floor() 
translate a float value into two different values.

Case #1:

query: id = 1.0 (floor(1.0) = 1 and ceil(1.0) = 1)

internal: id >=1 and id <= 1   : maintained the inclusiveness

 

Case #2:

query: id = 1.3 (floor(1.3) = 1 and ceil(1.3) = 2)

internal: id > 1 and id < 2   : removed the inclusiveness

 

 

> Providing a float value predicate to an integer primary index does not work 
> as expected.
> 
>
> Key: ASTERIXDB-2372
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2372
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Critical
>
> If we have an integer primary index and feed a float value predicate that is 
> not an integer such as 1.3, the search result is not correct.
>  
> The DDL and DML
> {code:java}
> drop dataverse test if exists;
> create dataverse test;
> use test;
> create type MyRecord as closed {
>   id: int64
> };
> create dataset MyData(MyRecord) primary key id;
> insert into MyData({"id":1});
> insert into MyData({"id":2});
> select * from MyData where id = 1.3;{code}
>  
> The result should be empty. But, it returns 1 and 2 as the result.
>  





[jira] [Created] (ASTERIXDB-2372) Providing a float value predicate to an integer primary index does not work as expected.

2018-04-25 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2372:
-

 Summary: Providing a float value predicate to an integer primary 
index does not work as expected.
 Key: ASTERIXDB-2372
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2372
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim
Assignee: Taewoo Kim


If we have an integer primary index and feed a float value predicate that is 
not an integer such as 1.3, the search result is not correct.

 

The DDL and DML
{code:java}
drop dataverse test if exists;
create dataverse test;
use test;

create type MyRecord as closed {
  id: int64
};

create dataset MyData(MyRecord) primary key id;

insert into MyData({"id":1});
insert into MyData({"id":2});

select * from MyData where id = 1.3;{code}
 

The result should be empty. But, it returns 1 and 2 as the result.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ASTERIXDB-2334) A range-search on a composite index doesn't work as expected.

2018-03-23 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411667#comment-16411667
 ] 

Taewoo Kim commented on ASTERIXDB-2334:
---

https://issues.apache.org/jira/browse/ASTERIXDB-920?jql=project%20%3D%20ASTERIXDB%20AND%20text%20~%20composite

I thought this issue was fixed by maintaining the SELECT operator. Somehow, the 
SELECT operator is not maintained. 

> A range-search on a composite index doesn't work as expected.
> -
>
> Key: ASTERIXDB-2334
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2334
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Dmitry Lychagin
>Priority: Critical
>
> A range-search query on a composite primary-index doesn't work as expected.
>  
> The DDL and INSERT statements
> {code:java}
> DROP DATAVERSE earthquake IF EXISTS;
> CREATE DATAVERSE earthquake;
> USE earthquake;
> CREATE TYPE QzExternalTypeNew AS { 
> stationid: string,
> pointid: string,
> itemid: string,
> samplerate: string,
> startdate: string,
> obsvalue: string
> };
> CREATE DATASET qz9130all(QzExternalTypeNew) PRIMARY KEY 
> stationid,pointid,itemid,samplerate,startdate;
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080509","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080510","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080511","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080512","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080513","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080514","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080515","obsvalue":"9"}
>  );
> {code}
>  
> The query
> {code:java}
> SELECT startdate 
> FROM qz9130all
> WHERE samplerate='01' and stationid='01' and pointid='5' and itemid='9130' 
> and startdate >= '20080510' and startdate < '20080513'
> ORDER BY startdate;{code}
>  
> The result
> {code:java}
> { "startdate": "20080510" }
> { "startdate": "20080511" }
> { "startdate": "20080512" }
> { "startdate": "20080513" }{code}
>  
> The last row should be filtered. As the following plan shows, there's no 
> SELECT operator. The optimizer thinks that the primary-index search can 
> generate the final answer. But, it doesn't. There are false positive results.
> {code:java}
> distribute result [$$25]
> -- DISTRIBUTE_RESULT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$25])
> -- STREAM_PROJECT  |PARTITIONED|
>   assign [$$25] <- [{"startdate": $$32}]
>   -- ASSIGN  |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$32(ASC) ]  |PARTITIONED|
>   order (ASC, $$32) 
>   -- STABLE_SORT [$$32(ASC)]  |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>   project ([$$32])
>   -- STREAM_PROJECT  |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>   unnest-map [$$28, $$29, $$30, $$31, $$32, $$qz9130all] <- 
> index-search("qz9130all", 0, "earthquake", "qz9130all", FALSE, FALSE, 5, 
> $$38, $$39, $$40, $$41, $$42, 5, $$43, $$44, $$45, $$46, $$47, TRUE, TRUE, 
> TRUE)
>   -- BTREE_SEARCH  |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>   assign [$$38, $$39, $$40, $$41, $$42, $$43, $$44, $$45, 
> $$46, $$47] <- ["01", "5", "9130", "01", "20080510", "01", "5", "9130", "01", 
> "20080513"]
>   -- ASSIGN  |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE  |PARTITIONED|{code}
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ASTERIXDB-2334) A range-search on a composite index doesn't work as expected.

2018-03-23 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411285#comment-16411285
 ] 

Taewoo Kim commented on ASTERIXDB-2334:
---

I think the issue is that the optimizer decides it doesn't have to add a 
SELECT operator since all predicates are part of the primary index. However, a 
composite-index range-search doesn't work correctly when the range predicate is 
not on the first field. We rely on the SELECT operator to filter out false 
positive results. 
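The false positive can be reproduced with a minimal sketch of a composite-key scan (illustrative names only, not the actual B-tree code): a B-tree scan returns every composite key between the low and high bounds in lexicographic order, and because the high key is applied inclusively in the plan above (the trailing TRUE flags), the '20080513' row comes back unless a residual SELECT filters it.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an inclusive composite-key range scan, as the B-tree performs it.
final class CompositeScan {
    // Compare composite keys field by field, lexicographically.
    static int compare(String[] a, String[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // Return every key k with low <= k <= high (both bounds inclusive).
    static List<String[]> scan(List<String[]> sortedKeys, String[] low, String[] high) {
        List<String[]> out = new ArrayList<>();
        for (String[] k : sortedKeys) {
            if (compare(k, low) >= 0 && compare(k, high) <= 0) {
                out.add(k);
            }
        }
        return out;
    }
}
```

Scanning the seven qz9130all keys between the composite bounds ending in '20080510' and '20080513' returns four rows, including '20080513'; only the residual predicate startdate < '20080513' reduces that to the correct three.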

> A range-search on a composite index doesn't work as expected.
> -
>
> Key: ASTERIXDB-2334
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2334
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Dmitry Lychagin
>Priority: Critical
>
> A range-search query on a composite primary-index doesn't work as expected.
>  
> The DDL and INSERT statements
> {code:java}
> DROP DATAVERSE earthquake IF EXISTS;
> CREATE DATAVERSE earthquake;
> USE earthquake;
> CREATE TYPE QzExternalTypeNew AS { 
> stationid: string,
> pointid: string,
> itemid: string,
> samplerate: string,
> startdate: string,
> obsvalue: string
> };
> CREATE DATASET qz9130all(QzExternalTypeNew) PRIMARY KEY 
> stationid,pointid,itemid,samplerate,startdate;
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080509","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080510","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080511","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080512","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080513","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080514","obsvalue":"9"}
>  );
> INSERT INTO qz9130all( 
> {"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080515","obsvalue":"9"}
>  );
> {code}
>  
> The query
> {code:java}
> SELECT startdate 
> FROM qz9130all
> WHERE samplerate='01' and stationid='01' and pointid='5' and itemid='9130' 
> and startdate >= '20080510' and startdate < '20080513'
> ORDER BY startdate;{code}
>  
> The result
> {code:java}
> { "startdate": "20080510" }
> { "startdate": "20080511" }
> { "startdate": "20080512" }
> { "startdate": "20080513" }{code}
>  
> The last row should be filtered. As the following plan shows, there's no 
> SELECT operator. The optimizer thinks that the primary-index search can 
> generate the final answer. But, it doesn't. There are false positive results.
> {code:java}
> distribute result [$$25]
> -- DISTRIBUTE_RESULT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$25])
> -- STREAM_PROJECT  |PARTITIONED|
>   assign [$$25] <- [{"startdate": $$32}]
>   -- ASSIGN  |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$32(ASC) ]  |PARTITIONED|
>   order (ASC, $$32) 
>   -- STABLE_SORT [$$32(ASC)]  |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>   project ([$$32])
>   -- STREAM_PROJECT  |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>   unnest-map [$$28, $$29, $$30, $$31, $$32, $$qz9130all] <- 
> index-search("qz9130all", 0, "earthquake", "qz9130all", FALSE, FALSE, 5, 
> $$38, $$39, $$40, $$41, $$42, 5, $$43, $$44, $$45, $$46, $$47, TRUE, TRUE, 
> TRUE)
>   -- BTREE_SEARCH  |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>   assign [$$38, $$39, $$40, $$41, $$42, $$43, $$44, $$45, 
> $$46, $$47] <- ["01", "5", "9130", "01", "20080510", "01", "5", "9130", "01", 
> "20080513"]
>   -- ASSIGN  |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE  |PARTITIONED|{code}
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ASTERIXDB-2338) IllegalArgumentException happens when a page of an inverted list is read concurrently

2018-03-22 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2338:
-

 Summary: IllegalArgumentException happens when a page of an 
inverted list is read concurrently
 Key: ASTERIXDB-2338
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2338
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


If a page of an inverted list is read concurrently by multiple threads, the 
following exception happens. This is because concurrency control for reading a 
buffer in the buffer cache is not implemented. 
{code:java}
org.apache.hyracks.api.exceptions.HyracksDataException: 
java.lang.IllegalArgumentException
at 
org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51)
 ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:247)
 ~[hyracks-storage-am-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.write(AbstractFrameAppender.java:93)
 ~[hyracks-dataflow-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushAndReset(AbstractOneInputOneOutputOneFramePushRuntime.java:78)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushIfNotFailed(AbstractOneInputOneOutputOneFramePushRuntime.java:84)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:56)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.close(AssignRuntimeFactory.java:119)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.close(EmptyTupleSourceRuntimeFactory.java:65)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$SourcePushRuntime.initialize(AlgebricksMetaOperatorDescriptor.java:111)
 ~[algebricks-runtime-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$$Lambda$74/144499656.run(Unknown
 Source) ~[?:?]
at 
org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$9(SuperActivityOperatorNodePushable.java:204)
 ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$$Lambda$76/2082033757.call(Unknown
 Source) ~[?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0]
at java.lang.Thread.run(Thread.java:744) ~[?:1.8.0]
Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.position(Buffer.java:244) ~[?:1.8.0]
at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:209) ~[?:1.8.0]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.FixedSizeElementInvertedListCursor.loadPages(FixedSizeElementInvertedListCursor.java:225)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.search.TOccurrenceSearcher.search(TOccurrenceSearcher.java:72)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.search(OnDiskInvertedIndex.java:453)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexSearchCursor.doHasNext(LSMInvertedIndexSearchCursor.java:159)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.EnforcedIndexCursor.hasNext(EnforcedIndexCursor.java:69)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.writeSearchResults(IndexSearchOperatorNodePushable.java:203)
 ~[hyracks-storage-am-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:244)
 ~[hyracks-storage-am-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
... 14 more{code}
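One way to avoid the race above can be sketched as follows (illustrative only — PageLoader and getPage are made-up names, not the actual buffer-cache API): two threads filling the same ByteBuffer via unsynchronized position()/put() calls can interleave, which is how Buffer.position(int) ends up throwing IllegalArgumentException. Serializing the load per page and handing each reader an independent read-only view avoids shared position/limit state.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: load an inverted-list page once, under a lock, and
// give every concurrent reader its own read-only view of the same bytes.
final class PageLoader {
    private final ByteBuffer page = ByteBuffer.allocate(4096);
    private volatile boolean loaded = false;

    // Every reader calls this; only the first call actually copies the bytes.
    synchronized ByteBuffer getPage(byte[] onDiskBytes) {
        if (!loaded) {
            page.clear();
            page.put(onDiskBytes, 0, Math.min(onDiskBytes.length, page.capacity()));
            page.flip(); // position = 0, limit = bytes written
            loaded = true;
        }
        // Each caller gets independent position/limit over the shared content,
        // so no two threads mutate the same Buffer state.
        return page.asReadOnlyBuffer();
    }
}
```

The key design choice is that the mutable fill happens exactly once inside the synchronized section; afterwards the page is effectively immutable and safe to share.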



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ASTERIXDB-2334) A range-search on a composite index doesn't work as expected.

2018-03-20 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2334:
-

 Summary: A range-search on a composite index doesn't work as 
expected.
 Key: ASTERIXDB-2334
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2334
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


A range-search query on a composite primary-index doesn't work as expected.

 

The DDL and INSERT statements
{code:java}
DROP DATAVERSE earthquake IF EXISTS;
CREATE DATAVERSE earthquake;
USE earthquake;

CREATE TYPE QzExternalTypeNew AS { 
stationid: string,
pointid: string,
itemid: string,
samplerate: string,
startdate: string,
obsvalue: string
};

CREATE DATASET qz9130all(QzExternalTypeNew) PRIMARY KEY 
stationid,pointid,itemid,samplerate,startdate;

INSERT INTO qz9130all( 
{"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080509","obsvalue":"9"}
 );
INSERT INTO qz9130all( 
{"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080510","obsvalue":"9"}
 );
INSERT INTO qz9130all( 
{"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080511","obsvalue":"9"}
 );
INSERT INTO qz9130all( 
{"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080512","obsvalue":"9"}
 );
INSERT INTO qz9130all( 
{"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080513","obsvalue":"9"}
 );
INSERT INTO qz9130all( 
{"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080514","obsvalue":"9"}
 );
INSERT INTO qz9130all( 
{"stationid":"01","pointid":"5","itemid":"9130","samplerate":"01","startdate":"20080515","obsvalue":"9"}
 );
{code}
 

The query
{code:java}
SELECT startdate 
FROM qz9130all
WHERE samplerate='01' and stationid='01' and pointid='5' and itemid='9130' and 
startdate >= '20080510' and startdate < '20080513'
ORDER BY startdate;{code}
 

The result
{code:java}
{ "startdate": "20080510" }
{ "startdate": "20080511" }
{ "startdate": "20080512" }
{ "startdate": "20080513" }{code}
 

The last row should be filtered. As the following plan shows, there's no SELECT 
operator. The optimizer thinks that the primary-index search can generate the 
final answer. But, it doesn't. There are false positive results.
{code:java}
distribute result [$$25]
-- DISTRIBUTE_RESULT  |PARTITIONED|
  exchange
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
project ([$$25])
-- STREAM_PROJECT  |PARTITIONED|
  assign [$$25] <- [{"startdate": $$32}]
  -- ASSIGN  |PARTITIONED|
exchange
-- SORT_MERGE_EXCHANGE [$$32(ASC) ]  |PARTITIONED|
  order (ASC, $$32) 
  -- STABLE_SORT [$$32(ASC)]  |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
  project ([$$32])
  -- STREAM_PROJECT  |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
  unnest-map [$$28, $$29, $$30, $$31, $$32, $$qz9130all] <- 
index-search("qz9130all", 0, "earthquake", "qz9130all", FALSE, FALSE, 5, $$38, 
$$39, $$40, $$41, $$42, 5, $$43, $$44, $$45, $$46, $$47, TRUE, TRUE, TRUE)
  -- BTREE_SEARCH  |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
  assign [$$38, $$39, $$40, $$41, $$42, $$43, $$44, $$45, 
$$46, $$47] <- ["01", "5", "9130", "01", "20080510", "01", "5", "9130", "01", 
"20080513"]
  -- ASSIGN  |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE  |PARTITIONED|{code}
 

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ASTERIXDB-2185) Cluster becomes UNUSABLE status after a NC fails to send a job failure.

2018-03-15 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401253#comment-16401253
 ] 

Taewoo Kim commented on ASTERIXDB-2185:
---

[~mhubail] : thanks for the investigation. Unfortunately, there are no more CC 
and NC log records. The entire cluster was wiped out and rebuilt. If this 
happens again, I will attach CC and NC log records. Thanks.

> Cluster becomes UNUSABLE status after a NC fails to send a job failure.
> ---
>
> Key: ASTERIXDB-2185
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2185
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: IDX - Indexes, RT - Runtime
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
>  Labels: triaged
>
> A cluster became UNUSABLE status after a NC failed to send a job failure 
> message. See the exception below.
> {code}
> Dec 03, 2017 6:47:13 PM org.apache.hyracks.control.nc.work.StartTasksWork run
> INFO: Initializing TAID:TID:ANID:ODID:16:0:1:0 -> [Asterix {
>   ets;
>   assign [0, 1, 2] := [Constant, Constant, Constant];
> }, 
> org.apache.hyracks.storage.am.lsm.invertedindex.dataflow.LSMInvertedIndexSearchOperatorDescriptor@23d902c1,
>  org.apache.hyracks.dataflow.std.sort.ExternalSort
> OperatorDescriptor$1@2fc09944]
> Dec 03, 2017 6:47:13 PM 
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1
>  close
> INFO: InitialNumberOfRuns:0
> Dec 03, 2017 6:47:13 PM 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:13:0:1:0
> Dec 03, 2017 6:47:13 PM 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:13:0:0:0
> Dec 03, 2017 6:47:13 PM 
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1
>  close
> INFO: InitialNumberOfRuns:0
> Dec 03, 2017 6:47:13 PM 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:16:0:0:0
> Dec 03, 2017 6:47:13 PM 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:16:0:1:0
> Dec 03, 2017 6:48:02 PM 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: AbortTasks
> Dec 03, 2017 6:48:02 PM org.apache.hyracks.control.nc.work.AbortTasksWork run
> INFO: Aborting Tasks: JID:0:[TAID:TID:ANID:ODID:0:0:0:0, 
> TAID:TID:ANID:ODID:3:0:0:0, TAID:TID:ANID:ODID:3:0:1:0]
> Dec 03, 2017 6:48:02 PM org.apache.hyracks.control.nc.Task run
> WARNING: Task TAID:TID:ANID:ODID:3:0:0:0 failed with exception
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1302)
>   at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
>   at org.apache.hyracks.control.nc.Task.run(Task.java:325)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:744)
> Dec 03, 2017 6:48:02 PM org.apache.hyracks.control.nc.Task run
> WARNING: Task TAID:TID:ANID:ODID:3:0:1:0 failed with exception
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1302)
>   at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
>   at org.apache.hyracks.control.nc.Task.run(Task.java:325)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:744)
> Dec 03, 2017 6:48:02 PM 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: NotifyTaskFailure
> Dec 03, 2017 6:48:02 PM 
> org.apache.hyracks.control.nc.work.NotifyTaskFailureWork run
> WARNING: 1 is sending a notification to cc that task 
> TAID:TID:ANID:ODID:3:0:0:0 has failed
> org.apache.hyracks.api.exceptions.HyracksDataException: HYR0003: 
> java.lang.InterruptedException
>   at 
> org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:68)
>   at org.apache.hyracks.control.nc.Task.run(Task.java:367)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: 

[jira] [Commented] (ASTERIXDB-2331) Plan branch repeated

2018-03-15 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401248#comment-16401248
 ] 

Taewoo Kim commented on ASTERIXDB-2331:
---

No problem. It's just a complex query.

> Plan branch repeated
> 
>
> Key: ASTERIXDB-2331
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2331
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler
>Reporter: Wail Alkowaileet
>Assignee: Taewoo Kim
>Priority: Major
>
> I didn't investigate. But it looks like unmaintained Split output.
> DDL
> {noformat}
> DROP DATAVERSE SocialNetworkData IF EXISTS;
> CREATE DATAVERSE SocialNetworkData;
> USE SocialNetworkData;
> create type ChirpMessageType as {
> chirpid: int64,
> send_time: datetime
> };
> create type GleambookUserType as {
> id: int64,
> user_since: datetime
> };
> create type GleambookMessageType as {
> message_id: int64,
> author_id: int64,
> send_time: datetime
> };
> create dataset GleambookMessages(GleambookMessageType)
> primary key message_id;
> create dataset GleambookUsers(GleambookUserType)
> primary key id;
> create dataset ChirpMessages(ChirpMessageType)
> primary key chirpid;
> create index usrSinceIx on GleambookUsers(user_since);
> create index sndTimeIx on ChirpMessages(send_time);
> create index authorIdIx on GleambookMessages(author_id);
> {noformat}
> Query:
> {noformat}
> USE SocialNetworkData;
> EXPLAIN
> SELECT g.message_id
> FROM GleambookUsers as u, GleambookMessages as g
> WHERE u.id/*+indexnl*/ = g.author_id 
> AND u.user_since = datetime("2013-04-16T09:45:46")
> {noformat}
> Plan:
> {noformat}
> distribute result [$$28]
> -- DISTRIBUTE_RESULT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> union ($$54, $$55, $$28)
> -- UNION_ALL  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$54])
> -- STREAM_PROJECT  |PARTITIONED|
>   assign [$$54] <- [{"message_id": $$52}]
>   -- ASSIGN  |PARTITIONED|
> project ([$$52])
> -- STREAM_PROJECT  |PARTITIONED|
>   select (eq($$29, $$53.getField(1)))
>   -- STREAM_SELECT  |PARTITIONED|
> project ([$$29, $$52, $$53])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> unnest-map [$$52, $$53] <- 
> index-search("GleambookMessages", 0, "SocialNetworkData", 
> "GleambookMessages", TRUE, FALSE, 1, $$46, 1, $$46, TRUE, TRUE, TRUE)
> -- BTREE_SEARCH  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$29, $$46])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> split ($$47)
> -- SPLIT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$29, $$46, $$47])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> unnest-map [$$45, $$46, $$47] <- 
> index-search("authorIdIx", 0, "SocialNetworkData", "GleambookMessages", TRUE, 
> TRUE, 1, $$29, 1, $$29, TRUE, TRUE, TRUE)
> -- BTREE_SEARCH  |PARTITIONED|
>   exchange
>   -- BROADCAST_EXCHANGE  |PARTITIONED|
> union ($$43, $$38, $$29)
> -- UNION_ALL  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  
> |PARTITIONED|
> project ([$$43])
> -- STREAM_PROJECT  |PARTITIONED|
>   select (eq($$33, datetime: { 
> 2013-04-16T09:45:46.000Z }))
>   -- STREAM_SELECT  |PARTITIONED|
> project ([$$43, $$33])
> -- STREAM_PROJECT  
> |PARTITIONED|
>   assign [$$33] <- 
> [$$44.getField(1)]
>   -- ASSIGN  |PARTITIONED|
> exchange
> 

[jira] [Closed] (ASTERIXDB-2331) Plan branch repeated

2018-03-14 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2331.
-
Resolution: Invalid

> Plan branch repeated
> 
>
> Key: ASTERIXDB-2331
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2331
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler
>Reporter: Wail Alkowaileet
>Assignee: Taewoo Kim
>Priority: Major
>
> I didn't investigate. But it looks like unmaintained Split output.
> DDL
> {noformat}
> DROP DATAVERSE SocialNetworkData IF EXISTS;
> CREATE DATAVERSE SocialNetworkData;
> USE SocialNetworkData;
> create type ChirpMessageType as {
> chirpid: int64,
> send_time: datetime
> };
> create type GleambookUserType as {
> id: int64,
> user_since: datetime
> };
> create type GleambookMessageType as {
> message_id: int64,
> author_id: int64,
> send_time: datetime
> };
> create dataset GleambookMessages(GleambookMessageType)
> primary key message_id;
> create dataset GleambookUsers(GleambookUserType)
> primary key id;
> create dataset ChirpMessages(ChirpMessageType)
> primary key chirpid;
> create index usrSinceIx on GleambookUsers(user_since);
> create index sndTimeIx on ChirpMessages(send_time);
> create index authorIdIx on GleambookMessages(author_id);
> {noformat}
> Query:
> {noformat}
> USE SocialNetworkData;
> EXPLAIN
> SELECT g.message_id
> FROM GleambookUsers as u, GleambookMessages as g
> WHERE u.id/*+indexnl*/ = g.author_id 
> AND u.user_since = datetime("2013-04-16T09:45:46")
> {noformat}
> Plan:
> {noformat}
> distribute result [$$28]
> -- DISTRIBUTE_RESULT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> union ($$54, $$55, $$28)
> -- UNION_ALL  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$54])
> -- STREAM_PROJECT  |PARTITIONED|
>   assign [$$54] <- [{"message_id": $$52}]
>   -- ASSIGN  |PARTITIONED|
> project ([$$52])
> -- STREAM_PROJECT  |PARTITIONED|
>   select (eq($$29, $$53.getField(1)))
>   -- STREAM_SELECT  |PARTITIONED|
> project ([$$29, $$52, $$53])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> unnest-map [$$52, $$53] <- 
> index-search("GleambookMessages", 0, "SocialNetworkData", 
> "GleambookMessages", TRUE, FALSE, 1, $$46, 1, $$46, TRUE, TRUE, TRUE)
> -- BTREE_SEARCH  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$29, $$46])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> split ($$47)
> -- SPLIT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$29, $$46, $$47])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> unnest-map [$$45, $$46, $$47] <- 
> index-search("authorIdIx", 0, "SocialNetworkData", "GleambookMessages", TRUE, 
> TRUE, 1, $$29, 1, $$29, TRUE, TRUE, TRUE)
> -- BTREE_SEARCH  |PARTITIONED|
>   exchange
>   -- BROADCAST_EXCHANGE  |PARTITIONED|
> union ($$43, $$38, $$29)
> -- UNION_ALL  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  
> |PARTITIONED|
> project ([$$43])
> -- STREAM_PROJECT  |PARTITIONED|
>   select (eq($$33, datetime: { 
> 2013-04-16T09:45:46.000Z }))
>   -- STREAM_SELECT  |PARTITIONED|
> project ([$$43, $$33])
> -- STREAM_PROJECT  
> |PARTITIONED|
>   assign [$$33] <- 
> [$$44.getField(1)]
>   -- ASSIGN  |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE  
> 

[jira] [Commented] (ASTERIXDB-2331) Plan branch repeated

2018-03-14 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399565#comment-16399565
 ] 

Taewoo Kim commented on ASTERIXDB-2331:
---

This query consists of two index-only plan sub-queries. What a complex query it 
is. :)

 

1) Outer branch:

SELECT u.id from GleambookUsers as u WHERE u.user_since = 
datetime("2013-04-16T09:45:46");

 

2) Inner branch:

SELECT g.message_id FROM GleambookMessages as g WHERE XXX /*+indexnl*/ = 
g.author_id;

 

As [~luochen01] confirmed, the outer branch shows two paths after SPLIT (the 
first - a left path that performs a primary-index look-up for tuples whose 
instantTryLock failed during the secondary index-search; the second - tuples 
that need no further work). This output is then fed into the second index-only 
plan path. This is a correct plan. 
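The routing after SPLIT can be sketched as follows (illustrative only — the names and the {pk, lockOk} encoding are made up, not the actual runtime's representation): tuples whose instantTryLock succeeded flow through unchanged, while the rest are re-fetched through the primary index before the two paths are unioned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongUnaryOperator;

// Sketch of the index-only SPLIT routing: each secondary-index result carries
// its primary key and whether the instant try-lock succeeded.
final class IndexOnlySplit {
    static List<Long> route(List<long[]> secondaryResults /* {pk, lockOk} */,
                            LongUnaryOperator primaryLookup) {
        List<Long> out = new ArrayList<>();
        for (long[] r : secondaryResults) {
            long pk = r[0];
            boolean lockOk = r[1] != 0;
            // lockOk: the secondary entry is consistent, emit it directly;
            // otherwise: go through the primary index for a consistent value.
            out.add(lockOk ? pk : primaryLookup.applyAsLong(pk));
        }
        return out;
    }
}
```

In the real plan the two paths are separate operators joined by UNION_ALL; collapsing them into one loop here just makes the routing decision visible.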

 

 

 

> Plan branch repeated
> 
>
> Key: ASTERIXDB-2331
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2331
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler
>Reporter: Wail Alkowaileet
>Assignee: Taewoo Kim
>Priority: Major
>
> I didn't investigate. But it looks like unmaintained Split output.
> DDL
> {noformat}
> DROP DATAVERSE SocialNetworkData IF EXISTS;
> CREATE DATAVERSE SocialNetworkData;
> USE SocialNetworkData;
> create type ChirpMessageType as {
> chirpid: int64,
> send_time: datetime
> };
> create type GleambookUserType as {
> id: int64,
> user_since: datetime
> };
> create type GleambookMessageType as {
> message_id: int64,
> author_id: int64,
> send_time: datetime
> };
> create dataset GleambookMessages(GleambookMessageType)
> primary key message_id;
> create dataset GleambookUsers(GleambookUserType)
> primary key id;
> create dataset ChirpMessages(ChirpMessageType)
> primary key chirpid;
> create index usrSinceIx on GleambookUsers(user_since);
> create index sndTimeIx on ChirpMessages(send_time);
> create index authorIdIx on GleambookMessages(author_id);
> {noformat}
> Query:
> {noformat}
> USE SocialNetworkData;
> EXPLAIN
> SELECT g.message_id
> FROM GleambookUsers as u, GleambookMessages as g
> WHERE u.id/*+indexnl*/ = g.author_id 
> AND u.user_since = datetime("2013-04-16T09:45:46")
> {noformat}
> Plan:
> {noformat}
> distribute result [$$28]
> -- DISTRIBUTE_RESULT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> union ($$54, $$55, $$28)
> -- UNION_ALL  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$54])
> -- STREAM_PROJECT  |PARTITIONED|
>   assign [$$54] <- [{"message_id": $$52}]
>   -- ASSIGN  |PARTITIONED|
> project ([$$52])
> -- STREAM_PROJECT  |PARTITIONED|
>   select (eq($$29, $$53.getField(1)))
>   -- STREAM_SELECT  |PARTITIONED|
> project ([$$29, $$52, $$53])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> unnest-map [$$52, $$53] <- 
> index-search("GleambookMessages", 0, "SocialNetworkData", 
> "GleambookMessages", TRUE, FALSE, 1, $$46, 1, $$46, TRUE, TRUE, TRUE)
> -- BTREE_SEARCH  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$29, $$46])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> split ($$47)
> -- SPLIT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> project ([$$29, $$46, $$47])
> -- STREAM_PROJECT  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
> unnest-map [$$45, $$46, $$47] <- 
> index-search("authorIdIx", 0, "SocialNetworkData", "GleambookMessages", TRUE, 
> TRUE, 1, $$29, 1, $$29, TRUE, TRUE, TRUE)
> -- BTREE_SEARCH  |PARTITIONED|
>   exchange
>   -- BROADCAST_EXCHANGE  |PARTITIONED|
> union ($$43, $$38, $$29)
> -- UNION_ALL  |PARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  
> |PARTITIONED|
> project ([$$43])
> -- STREAM_PROJECT  

[jira] [Assigned] (ASTERIXDB-2306) Inverted list file is not deleted after an inverted index component merge operation

2018-03-13 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2306:
-

Assignee: Chen Luo  (was: Taewoo Kim)

> Inverted list file is not deleted after an inverted index component merge 
> operation
> ---
>
> Key: ASTERIXDB-2306
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2306
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: IDX - Indexes, STO - Storage
>Reporter: Taewoo Kim
>Assignee: Chen Luo
>Priority: Critical
>  Labels: triaged
>
> After the following exception, an inverted list file of an old component was 
> not deleted. This happened during the ingestion of 1 billion tweets on a 
> Cloudberry cluster with five nodes.
> {code:java}
> 23:15:15.269 [Executor-6082:5] WARN 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - Failure 
> scheduling replication or destroying merged component
> java.lang.IllegalStateException: Page 20629:2 is pinned and file is being 
> closed. Pincount is: 1 Page is confiscated: false
> at 
> org.apache.hyracks.storage.common.buffercache.BufferCache.invalidateIfFileIdMatch(BufferCache.java:930)
>  ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.common.buffercache.BufferCache.sweepAndFlush(BufferCache.java:904)
>  ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:997)
>  ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:983)
>  ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex.destroy(OnDiskInvertedIndex.java:170)
>  ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMDiskComponent.deactivateAndDestroy(AbstractLSMDiskComponent.java:166)
>  ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBloomFilterDiskComponent.deactivateAndDestroy(AbstractLSMWithBloomFilterDiskComponent.java:63)
>  ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBuddyDiskComponent.deactivateAndDestroy(AbstractLSMWithBuddyDiskComponent.java:56)
>  ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doExitComponents(LSMHarness.java:324)
>  [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.exitComponents(LSMHarness.java:411)
>  [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:639)
>  [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.merge(LSMInvertedIndexAccessor.java:125)
>  [hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:45)
>  [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:30)
>  [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [?:1.8.0]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [?:1.8.0]
> at java.lang.Thread.run(Thread.java:744) [?:1.8.0]{code}
>  
> The directory:
> {code:java}
> -rw-r--r-- 1 waans11 waans11 262160 Feb 27 13:13 
> 2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_d
> -rw-r--r-- 1 waans11 waans11 0 Feb 27 13:12 
> 2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_f
> -rw-r--r-- 1 waans11 waans11 76288560 Feb 27 13:13 
> 2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_i
> -rw-r--r-- 1 waans11 waans11 262160 Feb 27 13:15 
> 2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_d
> -rw-r--r-- 1 waans11 waans11 0 Feb 27 13:15 
> 2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_f
> -rw-r--r-- 1 waans11 waans11 2097280 Feb 27 13:15 
> 2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_i
> -rw-r--r-- 1 waans11 waans11 43256400 Feb 27 23:15 
> 2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_b
> -rw-r--r-- 1 waans11 waans11 262160 

[jira] [Commented] (ASTERIXDB-2306) Inverted list file is not deleted after an inverted index component merge operation

2018-03-13 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397435#comment-16397435
 ] 

Taewoo Kim commented on ASTERIXDB-2306:
---

This is continuously happening and breaking the index structure.

 

09:46:39.018 [Executor-1340:1] WARN 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - Failure scheduling 
replication or destroying merged component
java.lang.IllegalStateException: Page 3263:3 is pinned and file is being 
closed. Pincount is: 1 Page is confiscated: false
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.invalidateIfFileIdMatch(BufferCache.java:930)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.sweepAndFlush(BufferCache.java:896)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:997)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:983)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex.destroy(OnDiskInvertedIndex.java:170)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMDiskComponent.deactivateAndDestroy(AbstractLSMDiskComponent.java:166)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBloomFilterDiskComponent.deactivateAndDestroy(AbstractLSMWithBloomFilterDiskComponent.java:63)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBuddyDiskComponent.deactivateAndDestroy(AbstractLSMWithBuddyDiskComponent.java:56)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doExitComponents(LSMHarness.java:324)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.exitComponents(LSMHarness.java:411)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:639)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.merge(LSMInvertedIndexAccessor.java:125)
 [hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:45)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:30)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0]
at java.lang.Thread.run(Thread.java:744) [?:1.8.0]
 
 
13:40:41.022 [Executor-1908:1] WARN 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - Failure scheduling 
replication or destroying merged component
java.lang.IllegalStateException: Page 3595:0 is pinned and file is being 
closed. Pincount is: 1 Page is confiscated: false
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.invalidateIfFileIdMatch(BufferCache.java:930)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.sweepAndFlush(BufferCache.java:904)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:997)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:983)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex.destroy(OnDiskInvertedIndex.java:170)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMDiskComponent.deactivateAndDestroy(AbstractLSMDiskComponent.java:166)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBloomFilterDiskComponent.deactivateAndDestroy(AbstractLSMWithBloomFilterDiskComponent.java:63)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 

[jira] [Commented] (ASTERIXDB-2327) An AsterixDB node cannot start with java.lang.StackOverflowError

2018-03-12 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396563#comment-16396563
 ] 

Taewoo Kim commented on ASTERIXDB-2327:
---

I was able to start the instance after deleting the transaction log.
Note that the stack trace repeats a three-method cycle (freeJobsCachedEntities 
-> spillToDiskAndfreeMemory -> writeCurrentPartitionToDisk -> 
freeJobsCachedEntities), i.e. unbounded mutual recursion with no base case 
being reached.
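For reference, the quoted trace repeats the same three frames over and over, which is the signature of mutual recursion that never reaches a base case. A minimal sketch of that failure mode (hypothetical method names mirroring the trace; this is not the RecoveryManager code):

```java
public class RecoveryRecursionSketch {
    // Three methods calling each other in a cycle, as the stack trace shows:
    // freeJobsCachedEntities -> spillToDiskAndfreeMemory
    //   -> writeCurrentPartitionToDisk -> freeJobsCachedEntities -> ...
    static void freeJobsCachedEntities() { spillToDiskAndFreeMemory(); }

    static void spillToDiskAndFreeMemory() { writeCurrentPartitionToDisk(); }

    static void writeCurrentPartitionToDisk() { freeJobsCachedEntities(); }

    // With no base case, the JVM exhausts its thread stack and throws
    // java.lang.StackOverflowError, just as during the failed startup task.
    static String run() {
        try {
            freeJobsCachedEntities();
            return "no overflow";
        } catch (StackOverflowError e) {
            return "stack overflow";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```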

> An AsterixDB node cannot start with java.lang.StackOverflowError
> 
>
> Key: ASTERIXDB-2327
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2327
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Murtadha Hubail
>Priority: Major
> Attachments: nc-4.log.gz
>
>
> On the Cloudberry cluster, after I shut down the cluster and tried to bring 
> it up again, one of the servers showed the following exception.
>  
> {code:java}
> 20:05:41.700 [Executor-6:4] WARN 
> org.apache.asterix.transaction.management.service.recovery.AbstractCheckpointManager
>  - Reading checkpoint file: /mnt/ssd/scrat
> ch/waans11/asterixdb/txnlog/checkpoint_1520910173665
> 20:05:41.790 [Executor-6:4] ERROR 
> org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage - 
> Failed during startup task
> java.lang.StackOverflowError: null
> at java.util.HashMap$EntryIterator.<init>(HashMap.java:1461) ~[?:1.8.0]
> at java.util.HashMap$EntrySet.iterator(HashMap.java:1005) ~[?:1.8.0]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:556)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> 

[jira] [Updated] (ASTERIXDB-2327) An AsterixDB node cannot start with java.lang.StackOverflowError

2018-03-12 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2327:
--
Attachment: nc-4.log.gz

> An AsterixDB node cannot start with java.lang.StackOverflowError
> 
>
> Key: ASTERIXDB-2327
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2327
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Priority: Major
> Attachments: nc-4.log.gz
>
>
> On the Cloudberry cluster, after I shut down the cluster and tried to bring 
> it up again, one of the servers showed the following exception.
>  
> {code:java}
> 20:05:41.700 [Executor-6:4] WARN 
> org.apache.asterix.transaction.management.service.recovery.AbstractCheckpointManager
>  - Reading checkpoint file: /mnt/ssd/scrat
> ch/waans11/asterixdb/txnlog/checkpoint_1520910173665
> 20:05:41.790 [Executor-6:4] ERROR 
> org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage - 
> Failed during startup task
> java.lang.StackOverflowError: null
> at java.util.HashMap$EntryIterator.<init>(HashMap.java:1461) ~[?:1.8.0]
> at java.util.HashMap$EntrySet.iterator(HashMap.java:1005) ~[?:1.8.0]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:556)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> .4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
> SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
> 

[jira] [Created] (ASTERIXDB-2327) An AsterixDB node cannot start with java.lang.StackOverflowError

2018-03-12 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2327:
-

 Summary: An AsterixDB node cannot start with 
java.lang.StackOverflowError
 Key: ASTERIXDB-2327
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2327
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


On the Cloudberry cluster, after I shut down the cluster and tried to bring it 
up again, one of the servers showed the following exception.

 
{code:java}
20:05:41.700 [Executor-6:4] WARN 
org.apache.asterix.transaction.management.service.recovery.AbstractCheckpointManager
 - Reading checkpoint file: /mnt/ssd/scrat
ch/waans11/asterixdb/txnlog/checkpoint_1520910173665
20:05:41.790 [Executor-6:4] ERROR 
org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage - 
Failed during startup task
java.lang.StackOverflowError: null
at java.util.HashMap$EntryIterator.<init>(HashMap.java:1461) ~[?:1.8.0]
at java.util.HashMap$EntrySet.iterator(HashMap.java:1005) ~[?:1.8.0]
at 
org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:556)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.writeCurrentPartitionToDisk(RecoveryManager.java:929)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9
.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager$JobEntityCommits.spillToDiskAndfreeMemory(RecoveryManager.java:840)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-
SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.freeJobsCachedEntities(RecoveryManager.java:559)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.access$200(RecoveryManager.java:91) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 

[jira] [Commented] (ASTERIXDB-2311) After restart, "Failed to redo" exception was generated.

2018-03-05 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386979#comment-16386979
 ] 

Taewoo Kim commented on ASTERIXDB-2311:
---

After deleting the transaction log, the node was able to start.

> After restart, "Failed to redo" exception was generated.
> 
>
> Key: ASTERIXDB-2311
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2311
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Ian Maxon
>Priority: Major
> Attachments: nc-4.log
>
>
> During Cloudberry's real-time tweet ingestion, I found an issue in the 
> application that feeds tweets to AsterixDB, so I stopped that process. I 
> also stopped the feed itself, then started the Cloudberry instance and saw 
> the following error message.
> {code:java}
> 21:58:33.129 [Executor-6:4] ERROR 
> org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage - 
> Failed during startup task
> java.lang.IllegalStateException: Failed to redo
> at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:784) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.startRecoveryRedoPhase(RecoveryManager.java:368)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.replayPartitionsLogs(RecoveryManager.java:178)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.RecoveryManager.startLocalRecovery(RecoveryManager.java:170)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.nc.task.LocalRecoveryTask.perform(LocalRecoveryTask.java:45)
>  ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage.handle(RegistrationTasksResponseMessage.java:62)
>  [asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.messaging.NCMessageBroker.lambda$receivedMessage$3(NCMessageBroker.java:100)
>  [asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.messaging.NCMessageBroker$$Lambda$70/727538728.run(Unknown 
> Source) [asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [?:1.8.0]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [?:1.8.0]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [?:1.8.0]
> at java.lang.Thread.run(Thread.java:744) [?:1.8.0]
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: HYR0033: 
> Inserting duplicate keys into the primary storage
> at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:55)
>  ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.insert(LSMBTree.java:213)
>  ~[hyracks-storage-am-lsm-btree-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.modify(LSMBTree.java:164)
>  ~[hyracks-storage-am-lsm-btree-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:482)
>  ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.forceModify(LSMHarness.java:422)
>  ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.forceInsert(LSMTreeIndexAccessor.java:176)
>  ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
> at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:774) 
> ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> ... 12 more{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ASTERIXDB-2311) After restart, "Failed to redo" exception was generated.

2018-03-04 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2311:
-

 Summary: After restart, "Failed to redo" exception was generated.
 Key: ASTERIXDB-2311
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2311
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim
 Attachments: nc-4.log

During Cloudberry's real-time tweet ingestion, I found an issue in the 
application that feeds tweets to AsterixDB, so I stopped that process. I also 
stopped the feed itself, then started the Cloudberry instance and saw the 
following error message.
{code:java}
21:58:33.129 [Executor-6:4] ERROR 
org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage - 
Failed during startup task
java.lang.IllegalStateException: Failed to redo
at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:784) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.startRecoveryRedoPhase(RecoveryManager.java:368)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.replayPartitionsLogs(RecoveryManager.java:178)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.RecoveryManager.startLocalRecovery(RecoveryManager.java:170)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.nc.task.LocalRecoveryTask.perform(LocalRecoveryTask.java:45)
 ~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage.handle(RegistrationTasksResponseMessage.java:62)
 [asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.messaging.NCMessageBroker.lambda$receivedMessage$3(NCMessageBroker.java:100)
 [asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.messaging.NCMessageBroker$$Lambda$70/727538728.run(Unknown 
Source) [asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0]
at java.lang.Thread.run(Thread.java:744) [?:1.8.0]
Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: HYR0033: 
Inserting duplicate keys into the primary storage
at 
org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:55)
 ~[hyracks-api-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.insert(LSMBTree.java:213)
 ~[hyracks-storage-am-lsm-btree-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.modify(LSMBTree.java:164)
 ~[hyracks-storage-am-lsm-btree-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:482)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.forceModify(LSMHarness.java:422)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.forceInsert(LSMTreeIndexAccessor.java:176)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at org.apache.asterix.app.nc.RecoveryManager.redo(RecoveryManager.java:774) 
~[asterix-app-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
... 12 more{code}
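The root cause suggested by the trace is that recovery replays an insert whose effect already reached the primary index before the crash, so the replayed `forceInsert` fails with a duplicate key. A minimal sketch of the idempotent-redo idea (check before reapplying); the `KeyValueStore` and `redoInsert` names below are illustrative, not the actual AsterixDB `RecoveryManager` API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the primary index: it rejects duplicate keys,
// like the LSM B-tree in the stack trace above.
class KeyValueStore {
    private final Map<Long, String> data = new HashMap<>();

    void insert(long key, String value) {
        if (data.containsKey(key)) {
            throw new IllegalStateException(
                "Inserting duplicate keys into the primary storage");
        }
        data.put(key, value);
    }

    boolean contains(long key) { return data.containsKey(key); }

    int size() { return data.size(); }
}

public class RedoSketch {
    // Idempotent redo: skip a log record whose effect is already present.
    static void redoInsert(KeyValueStore store, long key, String value) {
        if (store.contains(key)) {
            return; // already applied and persisted before the crash
        }
        store.insert(key, value);
    }

    public static void main(String[] args) {
        KeyValueStore store = new KeyValueStore();
        store.insert(1L, "tweet-1");      // applied and flushed before the crash
        redoInsert(store, 1L, "tweet-1"); // replayed from the log: no failure
        redoInsert(store, 2L, "tweet-2"); // not yet applied: redo performs it
        System.out.println(store.size()); // prints 2
    }
}
```

Without the `contains` guard, replaying the first record reproduces exactly the `HYR0033` failure reported here.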





[jira] [Updated] (ASTERIXDB-2306) Inverted list file is not deleted after an inverted index component merge operation

2018-02-28 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2306:
--
Description: 
After the following exception, an inverted list file of an old component was 
not deleted. This happened during an ingestion of 1 billion tweets on a 
five-node Cloudberry cluster.
{code:java}
23:15:15.269 [Executor-6082:5] WARN 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - Failure scheduling 
replication or destroying merged component
java.lang.IllegalStateException: Page 20629:2 is pinned and file is being 
closed. Pincount is: 1 Page is confiscated: false
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.invalidateIfFileIdMatch(BufferCache.java:930)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.sweepAndFlush(BufferCache.java:904)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:997)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:983)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex.destroy(OnDiskInvertedIndex.java:170)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMDiskComponent.deactivateAndDestroy(AbstractLSMDiskComponent.java:166)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBloomFilterDiskComponent.deactivateAndDestroy(AbstractLSMWithBloomFilterDiskComponent.java:63)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBuddyDiskComponent.deactivateAndDestroy(AbstractLSMWithBuddyDiskComponent.java:56)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doExitComponents(LSMHarness.java:324)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.exitComponents(LSMHarness.java:411)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:639)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.merge(LSMInvertedIndexAccessor.java:125)
 [hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:45)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:30)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0]
at java.lang.Thread.run(Thread.java:744) [?:1.8.0]{code}
 

The directory:
{code:java}
-rw-r--r-- 1 waans11 waans11 262160 Feb 27 13:13 
2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_d
-rw-r--r-- 1 waans11 waans11 0 Feb 27 13:12 
2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_f
-rw-r--r-- 1 waans11 waans11 76288560 Feb 27 13:13 
2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_i
-rw-r--r-- 1 waans11 waans11 262160 Feb 27 13:15 
2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_d
-rw-r--r-- 1 waans11 waans11 0 Feb 27 13:15 
2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_f
-rw-r--r-- 1 waans11 waans11 2097280 Feb 27 13:15 
2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_i
-rw-r--r-- 1 waans11 waans11 43256400 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_b
-rw-r--r-- 1 waans11 waans11 262160 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_d
-rw-r--r-- 1 waans11 waans11 0 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_f
-rw-r--r-- 1 waans11 waans11 148120400 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_i{code}
 

  was:
After the following exception, an inverted list file of an old component was 
not deleted.
{code:java}
23:15:15.269 [Executor-6082:5] WARN 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - Failure scheduling 
replication or destroying merged component
java.lang.IllegalStateException: Page 20629:2 is pinned and file is being 
closed. Pincount is: 1 Page is confiscated: false
at 

[jira] [Created] (ASTERIXDB-2306) Inverted list file is not deleted after an inverted index component merge operation

2018-02-28 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2306:
-

 Summary: Inverted list file is not deleted after an inverted index 
component merge operation
 Key: ASTERIXDB-2306
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2306
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


After the following exception, an inverted list file of an old component was 
not deleted.
{code:java}
23:15:15.269 [Executor-6082:5] WARN 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness - Failure scheduling 
replication or destroying merged component
java.lang.IllegalStateException: Page 20629:2 is pinned and file is being 
closed. Pincount is: 1 Page is confiscated: false
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.invalidateIfFileIdMatch(BufferCache.java:930)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.sweepAndFlush(BufferCache.java:904)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:997)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.common.buffercache.BufferCache.deleteFile(BufferCache.java:983)
 ~[hyracks-storage-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex.destroy(OnDiskInvertedIndex.java:170)
 ~[hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMDiskComponent.deactivateAndDestroy(AbstractLSMDiskComponent.java:166)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBloomFilterDiskComponent.deactivateAndDestroy(AbstractLSMWithBloomFilterDiskComponent.java:63)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.api.AbstractLSMWithBuddyDiskComponent.deactivateAndDestroy(AbstractLSMWithBuddyDiskComponent.java:56)
 ~[hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.doExitComponents(LSMHarness.java:324)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.exitComponents(LSMHarness.java:411)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:639)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.merge(LSMInvertedIndexAccessor.java:125)
 [hyracks-storage-am-lsm-invertedindex-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:45)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:30)
 [hyracks-storage-am-lsm-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0]
at java.lang.Thread.run(Thread.java:744) [?:1.8.0]{code}
 

The directory:
{code:java}
-rw-r--r-- 1 waans11 waans11 262160 Feb 27 13:13 
2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_d
-rw-r--r-- 1 waans11 waans11 0 Feb 27 13:12 
2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_f
-rw-r--r-- 1 waans11 waans11 76288560 Feb 27 13:13 
2018-02-27-13-12-52-923_2018-02-27-12-45-21-400_i
-rw-r--r-- 1 waans11 waans11 262160 Feb 27 13:15 
2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_d
-rw-r--r-- 1 waans11 waans11 0 Feb 27 13:15 
2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_f
-rw-r--r-- 1 waans11 waans11 2097280 Feb 27 13:15 
2018-02-27-13-15-21-671_2018-02-27-13-15-21-671_i
-rw-r--r-- 1 waans11 waans11 43256400 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_b
-rw-r--r-- 1 waans11 waans11 262160 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_d
-rw-r--r-- 1 waans11 waans11 0 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_f
-rw-r--r-- 1 waans11 waans11 148120400 Feb 27 23:15 
2018-02-27-23-14-41-965_2018-02-27-12-45-21-400_i{code}
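The warning says a page is still pinned (pin count 1) while the buffer cache is sweeping the merged component's file, so `destroy` aborts and the old `_i` inverted-list file is left on disk, as the directory listing shows. A toy pin-count model of why the delete must be refused; the names are illustrative, not the Hyracks `BufferCache` API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal pin-count model: a file cannot be deleted while any
// of its pages is still pinned by a reader.
class PinTracker {
    private final Map<String, Integer> pinCounts = new HashMap<>();

    void pin(String page)   { pinCounts.merge(page, 1, Integer::sum); }

    void unpin(String page) { pinCounts.merge(page, -1, Integer::sum); }

    // Mirrors the invariant behind "Page ... is pinned and file is
    // being closed": deletion is safe only when no page is pinned.
    boolean canDeleteFile() {
        return pinCounts.values().stream().allMatch(c -> c <= 0);
    }
}

public class PinSketch {
    public static void main(String[] args) {
        PinTracker t = new PinTracker();
        t.pin("20629:2");                      // a cursor still holds the page
        System.out.println(t.canDeleteFile()); // false: delete would break the reader
        t.unpin("20629:2");
        System.out.println(t.canDeleteFile()); // true: safe to sweep and delete
    }
}
```

The bug report amounts to a leaked pin: some reader never unpinned page 20629:2, so the merged component's files could never reach the deletable state.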
 





[jira] [Closed] (ASTERIXDB-1956) An edit-distance-check query generates "Unable to find free page in buffer cache after 1000 cycles (buffer cache undersized?)" Exception

2018-02-23 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-1956.
-
Resolution: Fixed

> An edit-distance-check query generates "Unable to find free page in buffer 
> cache after 1000 cycles (buffer cache undersized?)" Exception
> 
>
> Key: ASTERIXDB-1956
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1956
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> Setting: 
> 1 node (CC+NC), 20GB dataset, NC: max memory 4GB, Buffer-cache size: default 
> (682.75 MB), a keyword index on the field.
> Query:
> {code}
> use dataverse exp;
> count(
> for $o in dataset
> "AmazonReviewNoDup"
> where edit-distance-check($o.reviewerName,
> "Jacob Libin"
> ,
> int64("2")
> )[0]
> return {"oid":$o.id}
> );
> {code}
> Result from API call:
> {code}
> {
>   "requestID": "0a36c521-879f-429c-af29-7a2670150585",
>   "signature": "*",
>   "errors": [{
>   "code": "1",
>   "msg": "Unable to find free page in buffer cache after 1000 cycles 
> (buffer cache undersized?)"
>   }],
>   "status": "fatal",
>   "metrics": {
>   "elapsedTime": "52.433274939s",
>   "executionTime": "52.431912415s",
>   "resultCount": 0,
>   "resultSize": 0
>   }
> }
> No success - status code: fatal
> {
>   "requestID": "0a36c521-879f-429c-af29-7a2670150585",
>   "signature": "*",
>   "errors": [{
>   "code": "1",
>   "msg": "Unable to find free page in buffer cache after 1000 cycles 
> (buffer cache undersized?)"
>   }],
>   "status": "fatal",
>   "metrics": {
>   "elapsedTime": "52.433274939s",
>   "executionTime": "52.431912415s",
>   "resultCount": 0,
>   "resultSize": 0
>   }
> }
> {code}
> Exception in the nc.log
> {code}
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Unable to 
> find free page in buffer cache after 1000 cycles (buffer cache undersized?)
>   at 
> org.apache.hyracks.storage.common.buffercache.BufferCache.getPageLoop(BufferCache.java:1261)
>   at 
> org.apache.hyracks.storage.common.buffercache.BufferCache.findPage(BufferCache.java:228)
>   at 
> org.apache.hyracks.storage.common.buffercache.BufferCache.pin(BufferCache.java:195)
>   at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.FixedSizeElementInvertedListCursor.pinPages(FixedSizeElementInvertedListCursor.java:98)
>   at 
> org.apache.hyracks.storage.am.lsm.invertedindex.search.PartitionedTOccurrenceSearcher.search(PartitionedTOccurrenceSearcher.java:150)
>   at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.search(OnDiskInvertedIndex.java:505)
>   at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexSearchCursor.hasNext(LSMInvertedIndexSearchCursor.java:153)
>   at 
> org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.writeSearchResults(IndexSearchOperatorNodePushable.java:183)
>   at 
> org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:236)
>   ... 12 more
> {code}





[jira] [Closed] (ASTERIXDB-1065) OOM during creating ngram index

2018-02-23 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-1065.
-
Resolution: Fixed

> OOM during creating ngram index
> ---
>
> Key: ASTERIXDB-1065
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1065
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: *DB - AsterixDB, STO - Storage
>Reporter: asterixdb-importer
>Assignee: Taewoo Kim
>Priority: Major
>
> OOM during creating ngram index





[jira] [Closed] (ASTERIXDB-2296) AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree() cannot handle AUnionType

2018-02-23 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2296.
-
Resolution: Fixed

> AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree() cannot handle 
> AUnionType
> 
>
> Key: ASTERIXDB-2296
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2296
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> DDL
> {code:java}
> drop dataverse twitter if exists;
> create dataverse twitter if not exists;
> use twitter;
> create type typeUser if not exists as open {
>     id: int64,
>     name: string,
>     screen_name : string,
>     profile_image_url : string?,
>     lang : string,
>     location: string,
>     create_at: date,
>     description: string,
>     followers_count: int32,
>     friends_count: int32,
>     statues_count: int64
> };
> create type typePlace if not exists as open{
>     country : string,
>     country_code : string,
>     full_name : string,
>     id : string,
>     name : string,
>     place_type : string,
>     bounding_box : rectangle
> };
> create type typeGeoTag if not exists as open {
>     stateID: int32,
>     stateName: string,
>     countyID: int32,
>     countyName: string,
>     cityID: int32?,
>     cityName: string?
> };
> create type typeTweet if not exists as open {
>     create_at : datetime,
>     id: int64,
>     text: string,
>     in_reply_to_status : int64,
>     in_reply_to_user : int64,
>     favorite_count : int64,
>     coordinate: point?,
>     retweet_count : int64,
>     lang : string,
>     is_retweet: boolean,
>     hashtags : {{ string }} ?,
>     user_mentions : {{ int64 }} ? ,
>     user : typeUser,
>     place : typePlace?,
>     geo_tag: typeGeoTag
> };
> create dataset ds_tweet(typeTweet) if not exists primary key id with filter 
> on create_at with
> {"merge-policy":
>   {"name":"prefix","parameters":
>     {"max-mergable-component-size":134217728, 
> "max-tolerance-component-count":10}
>   }
> };
> create index text_idx if not exists on ds_tweet(text) type fulltext;
> {code}
> The following SQL++ query does not work with the above DDL. This is 
> because "place" is an optional field, so its type is an AUnionType that 
> contains an ARecordType rather than an ARecordType itself.
> {code:java}
> select t.`place`.`bounding_box` as `place.bounding_box`,t.`user`.`id` as 
> `user.id`,t.`id` as `id`,
> t.`coordinate` as `coordinate`,t.`create_at` as `create_at`
> from twitter.ds_tweet t
> where t.`create_at` >= datetime('2018-02-22T10:53:07.888Z') and t.`create_at` 
> < datetime('2018-02-22T18:50:39.301Z')
> and ftcontains(t.`text`, ['francisco'], {'mode':'all'}) and 
> t.`geo_tag`.`stateID` in [ 37,51,24,11 ]
> order by t.`create_at` desc
> limit 2147483647
> offset 0;
> {code}
> An exception:
> {code:java}
> 11:23:01.827 [QueryTranslator] INFO 
> org.apache.asterix.app.translator.QueryTranslator - 
> org.apache.asterix.om.types.AUnionType cannot be cast to 
> org.apache.asterix.om.types.ARecordType
> java.lang.ClassCastException: org.apache.asterix.om.types.AUnionType cannot 
> be cast to org.apache.asterix.om.types.ARecordType
> at 
> org.apache.asterix.optimizer.rules.am.AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree(AbstractIntroduceAccessMethodRule.java:877)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.AbstractIntroduceAccessMethodRule.fillFieldNamesInTheSubTree(AbstractIntroduceAccessMethodRule.java:973)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:412)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> 

[jira] [Assigned] (ASTERIXDB-2296) AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree() cannot handle AUnionType

2018-02-22 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2296:
-

Assignee: Taewoo Kim

> AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree() cannot handle 
> AUnionType
> 
>
> Key: ASTERIXDB-2296
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2296
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> DDL
> {code:java}
> drop dataverse twitter if exists;
> create dataverse twitter if not exists;
> use twitter;
> create type typeUser if not exists as open {
>     id: int64,
>     name: string,
>     screen_name : string,
>     profile_image_url : string?,
>     lang : string,
>     location: string,
>     create_at: date,
>     description: string,
>     followers_count: int32,
>     friends_count: int32,
>     statues_count: int64
> };
> create type typePlace if not exists as open{
>     country : string,
>     country_code : string,
>     full_name : string,
>     id : string,
>     name : string,
>     place_type : string,
>     bounding_box : rectangle
> };
> create type typeGeoTag if not exists as open {
>     stateID: int32,
>     stateName: string,
>     countyID: int32,
>     countyName: string,
>     cityID: int32?,
>     cityName: string?
> };
> create type typeTweet if not exists as open {
>     create_at : datetime,
>     id: int64,
>     text: string,
>     in_reply_to_status : int64,
>     in_reply_to_user : int64,
>     favorite_count : int64,
>     coordinate: point?,
>     retweet_count : int64,
>     lang : string,
>     is_retweet: boolean,
>     hashtags : {{ string }} ?,
>     user_mentions : {{ int64 }} ? ,
>     user : typeUser,
>     place : typePlace?,
>     geo_tag: typeGeoTag
> };
> create dataset ds_tweet(typeTweet) if not exists primary key id with filter 
> on create_at with
> {"merge-policy":
>   {"name":"prefix","parameters":
>     {"max-mergable-component-size":134217728, 
> "max-tolerance-component-count":10}
>   }
> };
> create index text_idx if not exists on ds_tweet(text) type fulltext;
> {code}
> The following SQL++ query does not work with the above DDL. This is 
> because "place" is an optional field, so its type is an AUnionType that 
> contains an ARecordType rather than an ARecordType itself.
> {code:java}
> select t.`place`.`bounding_box` as `place.bounding_box`,t.`user`.`id` as 
> `user.id`,t.`id` as `id`,
> t.`coordinate` as `coordinate`,t.`create_at` as `create_at`
> from twitter.ds_tweet t
> where t.`create_at` >= datetime('2018-02-22T10:53:07.888Z') and t.`create_at` 
> < datetime('2018-02-22T18:50:39.301Z')
> and ftcontains(t.`text`, ['francisco'], {'mode':'all'}) and 
> t.`geo_tag`.`stateID` in [ 37,51,24,11 ]
> order by t.`create_at` desc
> limit 2147483647
> offset 0;
> {code}
> An exception:
> {code:java}
> 11:23:01.827 [QueryTranslator] INFO 
> org.apache.asterix.app.translator.QueryTranslator - 
> org.apache.asterix.om.types.AUnionType cannot be cast to 
> org.apache.asterix.om.types.ARecordType
> java.lang.ClassCastException: org.apache.asterix.om.types.AUnionType cannot 
> be cast to org.apache.asterix.om.types.ARecordType
> at 
> org.apache.asterix.optimizer.rules.am.AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree(AbstractIntroduceAccessMethodRule.java:877)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.AbstractIntroduceAccessMethodRule.fillFieldNamesInTheSubTree(AbstractIntroduceAccessMethodRule.java:973)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:412)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
>  ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> at 
> 

[jira] [Created] (ASTERIXDB-2296) AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree() cannot handle AUnionType

2018-02-22 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2296:
-

 Summary: 
AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree() cannot handle 
AUnionType
 Key: ASTERIXDB-2296
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2296
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


DDL
{code:java}
drop dataverse twitter if exists;
create dataverse twitter if not exists;
use twitter;

create type typeUser if not exists as open {
    id: int64,
    name: string,
    screen_name : string,
    profile_image_url : string?,
    lang : string,
    location: string,
    create_at: date,
    description: string,
    followers_count: int32,
    friends_count: int32,
    statues_count: int64
};

create type typePlace if not exists as open{
    country : string,
    country_code : string,
    full_name : string,
    id : string,
    name : string,
    place_type : string,
    bounding_box : rectangle
};

create type typeGeoTag if not exists as open {
    stateID: int32,
    stateName: string,
    countyID: int32,
    countyName: string,
    cityID: int32?,
    cityName: string?
};

create type typeTweet if not exists as open {
    create_at : datetime,
    id: int64,
    text: string,
    in_reply_to_status : int64,
    in_reply_to_user : int64,
    favorite_count : int64,
    coordinate: point?,
    retweet_count : int64,
    lang : string,
    is_retweet: boolean,
    hashtags : {{ string }} ?,
    user_mentions : {{ int64 }} ? ,
    user : typeUser,
    place : typePlace?,
    geo_tag: typeGeoTag
};

create dataset ds_tweet(typeTweet) if not exists primary key id with filter on 
create_at with
{"merge-policy":
  {"name":"prefix","parameters":
    {"max-mergable-component-size":134217728, 
"max-tolerance-component-count":10}
  }
};

create index text_idx if not exists on ds_tweet(text) type fulltext;
{code}
The following SQL++ query does not work with the above DDL. This is because 
"place" is an optional field, so its type is an AUnionType that contains an 
ARecordType rather than an ARecordType itself.
{code:java}
select t.`place`.`bounding_box` as `place.bounding_box`,t.`user`.`id` as 
`user.id`,t.`id` as `id`,
t.`coordinate` as `coordinate`,t.`create_at` as `create_at`
from twitter.ds_tweet t
where t.`create_at` >= datetime('2018-02-22T10:53:07.888Z') and t.`create_at` < 
datetime('2018-02-22T18:50:39.301Z')
and ftcontains(t.`text`, ['francisco'], {'mode':'all'}) and 
t.`geo_tag`.`stateID` in [ 37,51,24,11 ]
order by t.`create_at` desc
limit 2147483647
offset 0;
{code}
An exception:
{code:java}
11:23:01.827 [QueryTranslator] INFO 
org.apache.asterix.app.translator.QueryTranslator - 
org.apache.asterix.om.types.AUnionType cannot be cast to 
org.apache.asterix.om.types.ARecordType
java.lang.ClassCastException: org.apache.asterix.om.types.AUnionType cannot be 
cast to org.apache.asterix.om.types.ARecordType
at 
org.apache.asterix.optimizer.rules.am.AbstractIntroduceAccessMethodRule.getFieldNameFromSubTree(AbstractIntroduceAccessMethodRule.java:877)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.AbstractIntroduceAccessMethodRule.fillFieldNamesInTheSubTree(AbstractIntroduceAccessMethodRule.java:973)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:412)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
 ~[asterix-algebra-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
at 
org.apache.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule.checkAndApplyTheSelectTransformation(IntroduceSelectAccessMethodRule.java:327)
 
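The fix implied by the report is to unwrap the union before casting: for an optional field the rule sees a union of the record type and null, and must extract the record branch instead of casting the union itself. A self-contained sketch of that defensive unwrap, using toy types rather than the actual `org.apache.asterix.om.types` classes:

```java
// Toy stand-ins for IAType / ARecordType / AUnionType, illustrating
// the unwrap that avoids the ClassCastException above.
interface IAType { String typeName(); }

class RecordType implements IAType {
    public String typeName() { return "record"; }
}

class UnionType implements IAType {
    private final IAType actual; // e.g. the record branch of record-or-null
    UnionType(IAType actual) { this.actual = actual; }
    IAType getActualType() { return actual; }
    public String typeName() { return "union"; }
}

public class UnwrapSketch {
    // Instead of a blind (RecordType) cast, unwrap unions first.
    static RecordType asRecordType(IAType t) {
        if (t instanceof UnionType) {
            // optional field: take the non-null branch of the union
            t = ((UnionType) t).getActualType();
        }
        if (t instanceof RecordType) {
            return (RecordType) t;
        }
        throw new IllegalArgumentException("not a record type: " + t.typeName());
    }

    public static void main(String[] args) {
        // Models "place : typePlace?" from the DDL: an optional record field.
        IAType place = new UnionType(new RecordType());
        System.out.println(asRecordType(place).typeName()); // prints "record"
    }
}
```

A direct `(RecordType) place` cast on the union value is exactly the `AUnionType cannot be cast to ARecordType` failure in the log.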

[jira] [Closed] (ASTERIXDB-2083) An inverted index-search generates OOM Exception.

2018-02-19 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2083.
-
Resolution: Fixed

> An inverted index-search generates OOM Exception.
> -
>
> Key: ASTERIXDB-2083
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2083
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> An inverted-index search can generate an OOM exception if the index is 
> large. This can affect any function that relies on an inverted-index 
> search, such as *ftcontains* and *contains*.
> An example exception message is shown below; it confirms that the failure 
> occurs during an inverted-index search.
> {code}
> Aug 15, 2017 6:58:06 AM 
> org.apache.hyracks.api.lifecycle.LifeCycleComponentManager uncaughtException
> SEVERE: Uncaught Exception from thread Executor-9:1
> java.lang.OutOfMemoryError: Java heap space
> Aug 15, 2017 6:58:06 AM 
> org.apache.hyracks.api.lifecycle.LifeCycleComponentManager stopAll
> INFO: Attempting to stop 
> org.apache.hyracks.api.lifecycle.LifeCycleComponentManager@69a3d1d
> Aug 15, 2017 6:58:06 AM 
> org.apache.hyracks.api.lifecycle.LifeCycleComponentManager stopAll
> SEVERE: Stopping instance
> Aug 15, 2017 6:58:06 AM 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: AbortTasks
> Aug 15, 2017 6:58:06 AM org.apache.hyracks.control.nc.Task run
> WARNING: Task TAID:TID:ANID:ODID:4:0:0:0 failed with exception
> org.apache.hyracks.api.exceptions.HyracksDataException: 
> java.lang.OutOfMemoryError: Java heap space
> at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:45)
> at 
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:220)
> at 
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:86)
> at org.apache.hyracks.control.nc.Task.run(Task.java:286)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.createOpContext(BTree.java:753)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.access$100(BTree.java:67)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree$BTreeAccessor.(BTree.java:844)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.createAccessor(BTree.java:820)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndexOpContext.(OnDiskInvertedIndexOpContext.java:42)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.(OnDiskInvertedIndex.java:422)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex.createAccessor(OnDiskInvertedIndex.java:491)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndex.search(LSMInvertedIndex.java:275)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.search(LSMHarness.java:445)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.search(LSMInvertedIndexAccessor.java:77)
> at 
> org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:193)
> at 
> org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.write(AbstractFrameAppender.java:92)
> at 
> org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushAndReset(AbstractOneInputOneOutputOneFramePushRuntime.java:66)
> at 
> org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushIfNotFailed(AbstractOneInputOneOutputOneFramePushRuntime.java:72)
> at 
> org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:55)
> at 
> org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.close(AssignRuntimeFactory.java:119)
> at 
> org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.close(EmptyTupleSourceRuntimeFactory.java:65)
> at 
> org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.initialize(AlgebricksMetaOperatorDescriptor.java:104)
> at 
> 

[jira] [Assigned] (ASTERIXDB-2290) Feed doesn't work when the address type is NC.

2018-02-17 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2290:
-

Assignee: Xikui Wang

> Feed doesn't work when the address type is NC.
> --
>
> Key: ASTERIXDB-2290
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2290
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Xikui Wang
>Priority: Major
>
> A feed works fine when the given address type is IP. The following feed can 
> be created and works without any issue.
> {code:java}
> create feed TweetFeed with {
> "adapter-name" : "socket_adapter",
> "sockets" : "x.x.x.x:10001",
> "address-type" : "IP",
> "type-name" : "typeTweet",
> "format" : "adm",
> "insert-feed" : "true"
> };{code}
>  
> The same feed doesn't work when the given address type is NC. The following 
> feed can be created. However, nothing happens when trying to ingest a record.
> {code:java}
> create feed TweetFeed with {
> "adapter-name" : "socket_adapter",
> "sockets" : "1:10001",
> "address-type" : "NC",
> "type-name" : "typeTweet",
> "format" : "adm",
> "insert-feed" : "true"
> };{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (ASTERIXDB-1984) Index-Nested-Loop Join should not care about the contents of the probe branch

2018-02-17 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-1984.
-
Resolution: Fixed

> Index-Nested-Loop Join should not care about the contents of the probe branch
> -
>
> Key: ASTERIXDB-1984
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1984
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> Currently, when the optimizer tries to transform a query with the "index_nl" 
> hint, it tries to identify datasource, assign, and unnest operators in both 
> the outer (probe) and inner branches. It also tries to identify the field 
> name of the variables being joined. However, the probe branch can be an 
> arbitrary sub-plan, and the only thing that really matters for the probe 
> subtree is the type of the probe-side field being joined. If that field 
> type is correctly identified, then any form of probe subtree combined with 
> a simple data-scan on the inner branch can be correctly transformed into an 
> index-utilization plan. The following queries should be transformed into an 
> index-utilization plan, but they are not on the current master.
> E.g.,
> {code}
> SELECT * FROM
> [1, 2, 3] AS bar JOIN foo on bar /*+ indexnl */ =  foo.key;
> SELECT  * FROM
> bar JOIN foo on bar.id = foo.key JOIN datac ON foo.key /*+ indexnl */ = 
> datac.val;
> SELECT  * FROM
> (SELECT id, COUNT(*) FROM bar GROUP BY id) AS barr JOIN foo ON barr.id /*+ 
> indexnl */ =  foo.key;
> {code}
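The point can be illustrated with a rough Python sketch (names and shapes are hypothetical, not AsterixDB internals): an index nested-loop join only needs the probe side to produce join-key values of the right type, and never inspects how those values were computed.

```python
# Hypothetical sketch: the probe side is any iterable of rows; only the
# type of the extracted join key matters for the inner index lookup.
def index_nl_join(probe_rows, key_of, inner_index):
    for row in probe_rows:
        for match in inner_index.get(key_of(row), []):
            yield (row, match)

# The probe side is a plain literal list, mirroring "[1, 2, 3] AS bar".
foo_index = {1: [{"key": 1}], 3: [{"key": 3}]}
pairs = list(index_nl_join([1, 2, 3], lambda v: v, foo_index))
```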





[jira] [Closed] (ASTERIXDB-1972) Index-only plan

2018-02-17 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-1972.
-
Resolution: Implemented

> Index-only plan
> ---
>
> Key: ASTERIXDB-1972
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1972
> Project: Apache AsterixDB
>  Issue Type: Improvement
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>Priority: Major
>
> It would be nice to have an index-only plan feature where a secondary-index 
> search alone can generate and return the final result without traversing the 
> primary index and applying the final verification using a SELECT operator. 
> The reason we need to traverse the primary index (plus verification using a 
> SELECT operator) is that we don't do any locking during the secondary-index 
> search; therefore, the secondary-index search is not authoritative. To make 
> it authoritative, the proposed solution is to apply an instantTryLock on 
> each PK found during the secondary-index search, which guarantees that we 
> see the final result. For more details, refer to the following Google doc. 
> https://docs.google.com/presentation/d/1Hj4ONmBWf5vf0EJrueygcoJirru2vy6eaCzmIiT0ypo/edit?usp=sharing
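A minimal Python sketch of the proposed flow (the function and lock names are hypothetical, not the actual Hyracks API): PKs whose instant try-lock succeeds can be returned directly from the secondary index, while the rest fall back to the verified primary-index path.

```python
def index_only_search(secondary_hits, try_instant_lock, primary_lookup_verified):
    """Split secondary-index hits into authoritative results and
    fallback PKs that still need the primary-index + SELECT path."""
    results, fallback = [], []
    for pk, payload in secondary_hits:
        if try_instant_lock(pk):           # instantTryLock succeeded:
            results.append((pk, payload))  # the secondary entry is final
        else:
            fallback.append(pk)            # re-verify via primary index
    results.extend(primary_lookup_verified(fallback))
    return results
```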





[jira] [Created] (ASTERIXDB-2290) Feed doesn't work when the address type is NC.

2018-02-17 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2290:
-

 Summary: Feed doesn't work when the address type is NC.
 Key: ASTERIXDB-2290
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2290
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


A feed works fine when the given address type is IP. The following feed can be 
created and works without any issue.
{code:java}
create feed TweetFeed with {
"adapter-name" : "socket_adapter",
"sockets" : "x.x.x.x:10001",
"address-type" : "IP",
"type-name" : "typeTweet",
"format" : "adm",
"insert-feed" : "true"
};{code}
 

The same feed doesn't work when the given address type is NC. The following 
feed can be created. However, nothing happens when trying to ingest a record.
{code:java}
create feed TweetFeed with {
"adapter-name" : "socket_adapter",
"sockets" : "1:10001",
"address-type" : "NC",
"type-name" : "typeTweet",
"format" : "adm",
"insert-feed" : "true"
};{code}
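For illustration, here is a minimal Python sketch of a client that pushes one ADM record to a socket feed; the host, port, and newline-terminated framing are assumptions for illustration, not the socket_adapter's documented protocol.

```python
import socket

def send_record(host: str, port: int, record: str) -> None:
    """Push one newline-terminated ADM record to a feed socket.

    The address and framing are assumptions; a real deployment may
    expect something different.
    """
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall((record + "\n").encode("utf-8"))
```

With address type NC, the logical node name ("1") must first be resolved to an actual IP and port before any such client can connect, which is presumably where ingestion silently stalls.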
 





[jira] [Created] (ASTERIXDB-2280) RTree on an optional nested field can't be built.

2018-02-06 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2280:
-

 Summary: RTree on an optional nested field can't be built.
 Key: ASTERIXDB-2280
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2280
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


If a nested field is optional, we can't build an RTree index on it.

 
{code:java}
use twitter;

create type typePlace if not exists as open{
country : string,
country_code : string,
full_name : string,
id : string,
name : string,
place_type : string,
bounding_box : rectangle
};

create type typeTweet2 if not exists as open {
create_at : datetime,
id: int64,
text: string,
in_reply_to_status : int64,
in_reply_to_user : int64,
favorite_count : int64,
coordinate: point?,
retweet_count : int64,
lang : string,
is_retweet: boolean,
hashtags : {{ string }} ?,
user_mentions : {{ int64 }} ? ,
place : typePlace?
};

create dataset ds_test(typeTweet2) primary key id with filter on create_at;

// success
CREATE INDEX dsTwIphoneIdx ON ds_test(create_at) TYPE BTREE;

// success
CREATE INDEX dsTwIphoneIdxCo ON ds_test(coordinate) TYPE RTREE;

// fail
CREATE INDEX dsTwIphoneIdxBBox ON ds_test(place.bounding_box) TYPE RTREE;

{code}





[jira] [Created] (ASTERIXDB-2274) Creating an index statement is executed even "Execute query" option was unchecked.

2018-01-31 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2274:
-

 Summary: Creating an index statement is executed even "Execute 
query" option was unchecked.
 Key: ASTERIXDB-2274
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2274
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


A CREATE INDEX statement is executed in the following situation on the 
current WebUI:

 

Check only "Print optimized logical plan". 

Uncheck "Execute query".





[jira] [Created] (ASTERIXDB-2259) Having an actual field value on a secondary index

2018-01-22 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2259:
-

 Summary: Having an actual field value on a secondary index
 Key: ASTERIXDB-2259
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2259
 Project: Apache AsterixDB
  Issue Type: Improvement
Reporter: Taewoo Kim


Currently, we don't keep the original field value in a secondary index when a 
type cast has happened (e.g., inserting 67.9 into an INT index). Keeping the 
actual field value would make the index-only plan possible in such cases.
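A tiny Python sketch of the problem (hypothetical structures, not AsterixDB's storage layout): once 67.9 is cast to 67 on its way into an INT index, the index alone can no longer answer queries on the field authoritatively, so an index-only plan is unsafe.

```python
secondary_index = {}  # cast int key -> set of primary keys

def index_insert(pk, field_value):
    # The cast is lossy: 67.9 is stored under key 67.
    secondary_index.setdefault(int(field_value), set()).add(pk)

index_insert(1, 67.9)
# An index-only answer to "val = 67" would wrongly include record 1;
# keeping the actual value (67.9) alongside the key would fix this.
```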





[jira] [Created] (ASTERIXDB-2258) Need to document the type-casting operation

2018-01-22 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2258:
-

 Summary: Need to document the type-casting operation
 Key: ASTERIXDB-2258
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2258
 Project: Apache AsterixDB
  Issue Type: Improvement
Reporter: Taewoo Kim


We need to document the type-casting operation in AsterixDB. Currently, some of 
it is documented in the functions section. However, there is no clear explanation 
of what happens with a non-enforced index, an enforced index, or an index on a 
closed-type field.





[jira] [Commented] (ASTERIXDB-2235) Normalization exception during a sort

2018-01-16 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327942#comment-16327942
 ] 

Taewoo Kim commented on ASTERIXDB-2235:
---

Maybe there are two issues? The following are the complete log records. However, 
at that time, the CC was running fine. 

 

 

Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.control.nc.work.NotifyTaskFailureWork run
WARNING: 4 is sending a notification to cc that task TAID:TID:ANID:ODID:2:0:7:0 
has failed
org.apache.hyracks.api.exceptions.HyracksDataException: 
java.lang.IllegalStateException: Corrupted string bytes: trying to access entry 
318767187 in a byte array of length 32768
 at 
org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:48)
 at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:418)
 at org.apache.hyracks.control.nc.Task.run(Task.java:323)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException: Corrupted string bytes: trying to 
access entry 318767187 in a byte array of length 32768
 at 
org.apache.hyracks.util.encoding.VarLenIntEncoderDecoder.decode(VarLenIntEncoderDecoder.java:82)
 at 
org.apache.hyracks.data.std.primitive.ByteArrayPointable.getContentLength(ByteArrayPointable.java:154)
 at 
org.apache.hyracks.data.std.primitive.ByteArrayPointable.normalize(ByteArrayPointable.java:174)
 at 
org.apache.hyracks.dataflow.common.data.normalizers.ByteArrayNormalizedKeyComputerFactory$1.normalize(ByteArrayNormalizedKeyComputerFactory.java:34)
 at 
org.apache.asterix.dataflow.data.nontagged.keynormalizers.AWrappedAscNormalizedKeyComputerFactory$1.normalize(AWrappedAscNormalizedKeyComputerFactory.java:46)
 at 
org.apache.hyracks.api.dataflow.value.INormalizedKeyComputer.normalize(INormalizedKeyComputer.java:25)
 at 
org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.sort(AbstractFrameSorter.java:193)
 at 
org.apache.hyracks.dataflow.std.sort.AbstractSortRunGenerator.close(AbstractSortRunGenerator.java:48)
 at 
org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1.close(AbstractSorterOperatorDescriptor.java:132)
 at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:409)
 ... 4 more

Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:6:0
Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1
 close
INFO: InitialNumberOfRuns:0
Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:2:0:6:0
Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1
 close
INFO: InitialNumberOfRuns:0
Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:2:0:7:0
Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1
 close
INFO: InitialNumberOfRuns:0
Jan 09, 2018 5:09:28 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:2:0:6:0
Jan 09, 2018 5:09:32 PM 
org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread doRun
SEVERE: Exception processing message; sleeping 1 seconds
java.io.IOException: Invalid argument
 at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
 at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
 at sun.nio.ch.IOUtil.read(IOUtil.java:197)
 at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
 at 
org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:282)
 at 
org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.run(IPCConnectionManager.java:195)

Jan 09, 2018 5:09:33 PM 
org.apache.hyracks.control.common.ipc.ControllerRemoteProxy ensureIpcHandle
WARNING: ipcHandle IPCHandle [addr=/128.195.52.77:1099 state=CLOSED] 
disconnected; retrying connection
Jan 09, 2018 5:09:33 PM 
org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread doRun
SEVERE: Exception processing message; sleeping 1 seconds
java.net.SocketException: Network is unreachable
 at sun.nio.ch.Net.connect0(Native Method)
 at sun.nio.ch.Net.connect(Net.java:435)
 at sun.nio.ch.Net.connect(Net.java:427)
 at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:643)
 at 
org.apache.hyracks.ipc.impl.IPCConnectionManager$NetworkThread.doRun(IPCConnectionManager.java:224)
 at 

[jira] [Commented] (ASTERIXDB-2235) Normalization exception during a sort

2018-01-10 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320635#comment-16320635
 ] 

Taewoo Kim commented on ASTERIXDB-2235:
---

[~tillw] I am trying to find the exact query that caused this. Right now, I have 
just read the first part of the exception.

{code}
org.apache.asterix.dataflow.data.nontagged.keynormalizers.AWrappedAscNormalizedKeyComputerFactory$1.normalize(AWrappedAscNormalizedKeyComputerFactory.java:46)
at 
org.apache.hyracks.api.dataflow.value.INormalizedKeyComputer.normalize(INormalizedKeyComputer.java:25)
at 
org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.sort(AbstractFrameSorter.java:193)
{code}
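The decoder at the top of that trace reads a variable-length integer whose continuation bits come from the data itself, so corrupted bytes can yield an absurd offset like 318767187 against a 32 KB frame. A simplified Python sketch of such a decoder (the byte layout is illustrative; Hyracks' actual VarLenIntEncoderDecoder encoding may differ):

```python
def decode_varlen_int(buf: bytes, start: int):
    """Decode a big-endian base-128 varint (high bit = continuation).

    Returns (value, bytes_consumed). If the bytes are corrupted, the
    continuation bits can point past the buffer, producing exactly the
    kind of "trying to access entry N" failure seen in the trace.
    """
    value, pos = 0, start
    while True:
        if pos >= len(buf):
            raise ValueError(
                f"Corrupted bytes: trying to access entry {pos} "
                f"in a byte array of length {len(buf)}")
        b = buf[pos]
        value = (value << 7) | (b & 0x7F)
        pos += 1
        if not b & 0x80:
            return value, pos - start
```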

> Normalization exception during a sort
> -
>
> Key: ASTERIXDB-2235
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2235
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>
> A Twittermap query generates the following exception during execution, and the 
> node becomes unavailable.
> {code}
> Jan 09, 2018 5:09:28 PM org.apache.hyracks.control.nc.Task run
> WARNING: Task TAID:TID:ANID:ODID:2:0:7:0 failed with exception
> org.apache.hyracks.api.exceptions.HyracksDataException: 
> java.lang.IllegalStateException: Corrupted string bytes: trying to access 
> entry 318767187 in a byte array of length 32768
>   at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:48)
>   at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:418)
>   at org.apache.hyracks.control.nc.Task.run(Task.java:323)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IllegalStateException: Corrupted string bytes: trying to 
> access entry 318767187 in a byte array of length 32768
>   at 
> org.apache.hyracks.util.encoding.VarLenIntEncoderDecoder.decode(VarLenIntEncoderDecoder.java:82)
>   at 
> org.apache.hyracks.data.std.primitive.ByteArrayPointable.getContentLength(ByteArrayPointable.java:154)
>   at 
> org.apache.hyracks.data.std.primitive.ByteArrayPointable.normalize(ByteArrayPointable.java:174)
>   at 
> org.apache.hyracks.dataflow.common.data.normalizers.ByteArrayNormalizedKeyComputerFactory$1.normalize(ByteArrayNormalizedKeyComputerFactory.java:34)
>   at 
> org.apache.asterix.dataflow.data.nontagged.keynormalizers.AWrappedAscNormalizedKeyComputerFactory$1.normalize(AWrappedAscNormalizedKeyComputerFactory.java:46)
>   at 
> org.apache.hyracks.api.dataflow.value.INormalizedKeyComputer.normalize(INormalizedKeyComputer.java:25)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.sort(AbstractFrameSorter.java:193)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractSortRunGenerator.close(AbstractSortRunGenerator.java:48)
>   at 
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1.close(AbstractSorterOperatorDescriptor.java:132)
>   at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:409)
>   ... 4 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ASTERIXDB-2235) Normalization exception during a sort

2018-01-09 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2235:
-

 Summary: Normalization exception during a sort
 Key: ASTERIXDB-2235
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2235
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


A Twittermap query generates the following exception during execution, and the 
node becomes unavailable.

{code}
Jan 09, 2018 5:09:28 PM org.apache.hyracks.control.nc.Task run
WARNING: Task TAID:TID:ANID:ODID:2:0:7:0 failed with exception
org.apache.hyracks.api.exceptions.HyracksDataException: 
java.lang.IllegalStateException: Corrupted string bytes: trying to access entry 
318767187 in a byte array of length 32768
at 
org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:48)
at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:418)
at org.apache.hyracks.control.nc.Task.run(Task.java:323)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException: Corrupted string bytes: trying to 
access entry 318767187 in a byte array of length 32768
at 
org.apache.hyracks.util.encoding.VarLenIntEncoderDecoder.decode(VarLenIntEncoderDecoder.java:82)
at 
org.apache.hyracks.data.std.primitive.ByteArrayPointable.getContentLength(ByteArrayPointable.java:154)
at 
org.apache.hyracks.data.std.primitive.ByteArrayPointable.normalize(ByteArrayPointable.java:174)
at 
org.apache.hyracks.dataflow.common.data.normalizers.ByteArrayNormalizedKeyComputerFactory$1.normalize(ByteArrayNormalizedKeyComputerFactory.java:34)
at 
org.apache.asterix.dataflow.data.nontagged.keynormalizers.AWrappedAscNormalizedKeyComputerFactory$1.normalize(AWrappedAscNormalizedKeyComputerFactory.java:46)
at 
org.apache.hyracks.api.dataflow.value.INormalizedKeyComputer.normalize(INormalizedKeyComputer.java:25)
at 
org.apache.hyracks.dataflow.std.sort.AbstractFrameSorter.sort(AbstractFrameSorter.java:193)
at 
org.apache.hyracks.dataflow.std.sort.AbstractSortRunGenerator.close(AbstractSortRunGenerator.java:48)
at 
org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1.close(AbstractSorterOperatorDescriptor.java:132)
at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:409)
... 4 more
{code}





[jira] [Updated] (ASTERIXDB-2218) Enforcement of a secondary index does not work.

2018-01-02 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2218:
--
Description: 
The enforced index does not check the field type when inserting a record. The 
following code works on the current master.

{code}
create type tempType if not exists as open {
id: int64
};

create dataset tempDataset(tempType) primary key id;

create index tempIndex on tempDataset(val:int64?) enforced;

insert into tempDataset({"id":1,"val":64.79});
{code}

For a closed-type field, the insert succeeds as well.

{code}
create type tempClosedType if not exists as closed {
id: int64,
val: int64
};

create dataset tempClosedDataset(tempClosedType) primary key id;

create index tempClosedIndex on tempClosedDataset(val);

insert into tempClosedDataset({"id":1,"val":64.79});
{code}
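What the enforced index appears to be missing is a lossiness check at insert time. A minimal Python sketch of such a check (hypothetical, not AsterixDB's actual cast logic): a value that cannot be represented losslessly as int64 should be rejected rather than silently cast.

```python
def enforce_int64(value):
    """Accept a value for an int64-enforced index only if the cast
    to an integer is lossless; otherwise reject the insert."""
    if isinstance(value, float) and not value.is_integer():
        raise TypeError(f"cannot losslessly cast {value} to int64")
    return int(value)
```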

  was:
The enforced index does not check the field type when inserting a record. The 
following code works on the current master.

{code}
create type tempType if not exists as open {
id: int64
};

create dataset tempDataset(tempType) primary key id;

create index tempIndex on tempDataset(val:int64?) enforced;

insert into tempDataset({"id":1,"val":64.79});
{code}


> Enforcement of a secondary index does not work.
> ---
>
> Key: ASTERIXDB-2218
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2218
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>
> The enforced index does not check the field type when inserting a record. The 
> following code works on the current master.
> {code}
> create type tempType if not exists as open {
> id: int64
> };
> create dataset tempDataset(tempType) primary key id;
> create index tempIndex on tempDataset(val:int64?) enforced;
> insert into tempDataset({"id":1,"val":64.79});
> {code}
> For a closed-type field, the insert succeeds as well.
> {code}
> create type tempClosedType if not exists as closed {
> id: int64,
> val: int64
> };
> create dataset tempClosedDataset(tempClosedType) primary key id;
> create index tempClosedIndex on tempClosedDataset(val);
> insert into tempClosedDataset({"id":1,"val":64.79});
> {code}





[jira] [Created] (ASTERIXDB-2218) Enforcement of a secondary index does not work.

2018-01-02 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2218:
-

 Summary: Enforcement of a secondary index does not work.
 Key: ASTERIXDB-2218
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2218
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


The enforced index does not check the field type when inserting a record. The 
following code works on the current master.

{code}
create type tempType if not exists as open {
id: int64
};

create dataset tempDataset(tempType) primary key id;

create index tempIndex on tempDataset(val:int64?) enforced;

insert into tempDataset({"id":1,"val":64.79});
{code}





[jira] [Closed] (ASTERIXDB-2153) Fulltext does not handle the search option properly

2017-12-30 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-2153.
-
Resolution: Fixed

> Fulltext does not handle the search option properly
> ---
>
> Key: ASTERIXDB-2153
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2153
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Taewoo Kim
>
> Full-text search does not handle the search option (conjunctive - AND or 
> disjunctive - OR) properly when a WHERE predicate contains multiple conditions, 
> as in the following case. It always conducts a disjunctive (OR) search even 
> though the option says to do an "AND" search. 
> {code}
> select t.`text` from twitter.ds_tweet t
> where t.`create_at` >= datetime('2017-10-10T16:48:28.980Z') and t.`create_at` 
> < datetime('2017-10-10T17:48:28.980Z') and ftcontains(t.`text`, 
> ['house','of','cards'], {'mode':'all'});
> {code}
> {code}
> select t.`text` from twitter.ds_tweet t
> where t.`create_at` >= datetime('2017-10-10T16:48:28.980Z') and t.`create_at` 
> < datetime('2017-10-10T17:48:28.980Z') and ftcontains(t.`text`, 
> ['house','of','cards']);
> {code}





[jira] [Updated] (ASTERIXDB-2215) Filter is not properly applied for a secondary inverted index search

2017-12-29 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim updated ASTERIXDB-2215:
--
Description: 
Depending on how predicate conditions are written on a field with a filter, the 
generated plan does not correctly show the min and max values of the filter.

{code}
drop dataverse twitter if exists;
create dataverse twitter if not exists;
use dataverse twitter;

create type typeUser if not exists as open {
id: int64,
name: string,
screen_name : string,
profile_image_url : string,
lang : string,
location: string,
create_at: date,
description: string,
followers_count: int32,
friends_count: int32,
statues_count: int64
};

create type typePlace if not exists as open{
country : string,
country_code : string,
full_name : string,
id : string,
name : string,
place_type : string,
bounding_box : rectangle
};

create type typeGeoTag if not exists as open {
stateID: int32,
stateName: string,
countyID: int32,
countyName: string,
cityID: int32?,
cityName: string?
};

create type typeTweet if not exists as open {
create_at : datetime,
id: int64,
"text": string,
in_reply_to_status : int64,
in_reply_to_user : int64,
favorite_count : int64,
coordinate: point?,
retweet_count : int64,
lang : string,
is_retweet: boolean,
hashtags : {{ string }} ?,
user_mentions : {{ int64 }} ? ,
user : typeUser,
place : typePlace?,
geo_tag: typeGeoTag
};

create dataset ds_tweet(typeTweet) if not exists primary key id with filter on 
create_at;
{code}

For the following query, the logical plan shows an empty min:[] and two 
variables in max:[] when doing an inverted-index search. 

{code}
USE twitter;
SELECT spatial_cell(get_points(place.bounding_box)[0], 
create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet
WHERE ftcontains(text, ['trump'], {'mode':'any'}) AND place.bounding_box IS NOT 
unknown 
AND datetime('2017-02-25T00:00:00') <= create_at AND  create_at < 
datetime('2017-02-26T00:00:00')
GROUP BY cell;
{code}

Exact predicates on the filter
{code}
datetime('2017-02-25T00:00:00') <= create_at AND  create_at < 
datetime('2017-02-26T00:00:00')
{code}

{code}
unnest-map [$$64, $$69, $$70] <- index-search("text_idx", 2, "twitter", 
"ds_tweet", FALSE, FALSE, 5, null, 21, TRUE, 1, $$63) with filter on min:[] 
max:[$$67, $$68]
-- 
SINGLE_PARTITION_INVERTED_INDEX_SEARCH  |PARTITIONED|
  exchange
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
assign [$$67, $$68, $$63] <- 
[datetime: { 2017-02-26T00:00:00.000Z }, datetime: { 2017-02-25T00:00:00.000Z 
}, array: [ "trump" ]]
-- ASSIGN  |PARTITIONED|
  empty-tuple-source
  -- EMPTY_TUPLE_SOURCE  
|PARTITIONED|
{code}


However, for the following query (which just switches the positions of the 
datetime constant and create_at at the end of the predicates), it shows another 
incorrect plan.

{code}
SELECT spatial_cell(get_points(place.bounding_box)[0], 
create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet
WHERE ftcontains(text, ['trump'], {'mode':'any'}) AND place.bounding_box IS NOT 
unknown 
AND datetime('2017-02-25T00:00:00') <= create_at AND  
datetime('2017-02-26T00:00:00') > create_at
GROUP BY cell;
{code}

Exact predicates on the filter:
{code}
datetime('2017-02-25T00:00:00') <= create_at AND  
datetime('2017-02-26T00:00:00') > create_at
{code}

{code}
unnest-map [$$64, $$69, $$70] <- index-search("text_idx", 2, "twitter", 
"ds_tweet", FALSE, FALSE, 5, null, 21, TRUE, 1, $$63) with filter on min:[$$67] 
max:[$$68]
-- 
SINGLE_PARTITION_INVERTED_INDEX_SEARCH  |PARTITIONED|
  exchange
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
assign [$$67, $$68, $$63] <- 
[datetime: { 2017-02-26T00:00:00.000Z }, datetime: { 2017-02-25T00:00:00.000Z 
}, array: [ "trump" ]]
-- ASSIGN  |PARTITIONED|
  empty-tuple-source
  -- EMPTY_TUPLE_SOURCE  
|PARTITIONED|
{code}
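Conceptually, both phrasings of the range predicates should fold into the same [min, max] filter bounds. A small Python sketch of that normalization (illustrative only; the optimizer's actual rule works on the operator plan, and the names here are hypothetical):

```python
def filter_bounds(predicates):
    """Fold range predicates on the filter field into (min, max).

    Each predicate is (field_on_left, op, value); "const op field"
    forms are normalized into "field op' const" before folding.
    """
    lo, hi = None, None
    for field_on_left, op, value in predicates:
        if not field_on_left:  # e.g. const <= field  ->  field >= const
            op = {"<": ">", "<=": ">=", ">": "<", ">=": "<="}[op]
        if op in (">", ">="):
            lo = value if lo is None else max(lo, value)
        else:
            hi = value if hi is None else min(hi, value)
    return lo, hi

# The two phrasings from the report should yield identical bounds:
a = filter_bounds([(False, "<=", "2017-02-25"), (True, "<", "2017-02-26")])
b = filter_bounds([(False, "<=", "2017-02-25"), (False, ">", "2017-02-26")])
```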

  was:
Depending on how predicate conditions are written on a field with a filter, the 
generated plan is sometimes correct and sometimes not.

{code}
drop dataverse twitter if exists;
create dataverse twitter if not exists;
use dataverse twitter;

create type typeUser if not exists as open {
id: int64,
name: string,
screen_name : string,
profile_image_url : string,
lang : string,
location: 

[jira] [Created] (ASTERIXDB-2215) Filter is not properly applied for a secondary inverted index search

2017-12-28 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2215:
-

 Summary: Filter is not properly applied for a secondary inverted 
index search
 Key: ASTERIXDB-2215
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2215
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


Depending on how the predicate conditions on a field with a filter are written, the 
generated plan is sometimes correct and sometimes not.

{code}
drop dataverse twitter if exists;
create dataverse twitter if not exists;
use dataverse twitter;

create type typeUser if not exists as open {
id: int64,
name: string,
screen_name : string,
profile_image_url : string,
lang : string,
location: string,
create_at: date,
description: string,
followers_count: int32,
friends_count: int32,
statues_count: int64
};

create type typePlace if not exists as open{
country : string,
country_code : string,
full_name : string,
id : string,
name : string,
place_type : string,
bounding_box : rectangle
};

create type typeGeoTag if not exists as open {
stateID: int32,
stateName: string,
countyID: int32,
countyName: string,
cityID: int32?,
cityName: string?
};

create type typeTweet if not exists as open {
create_at : datetime,
id: int64,
"text": string,
in_reply_to_status : int64,
in_reply_to_user : int64,
favorite_count : int64,
coordinate: point?,
retweet_count : int64,
lang : string,
is_retweet: boolean,
hashtags : {{ string }} ?,
user_mentions : {{ int64 }} ? ,
user : typeUser,
place : typePlace?,
geo_tag: typeGeoTag
};

create dataset ds_tweet(typeTweet) if not exists primary key id with filter on 
create_at;
{code}

For the following query, the logical plan incorrectly shows an empty min:[] and two 
variables in max:[] for the inverted-index search. 

{code}
USE twitter;
SELECT spatial_cell(get_points(place.bounding_box)[0], 
create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet
WHERE ftcontains(text, ['trump'], {'mode':'any'}) AND place.bounding_box IS NOT 
unknown 
AND datetime('2017-02-25T00:00:00') <= create_at AND  create_at < 
datetime('2017-02-26T00:00:00')
GROUP BY cell;
{code}

Exact predicates on the filter:
{code}
datetime('2017-02-25T00:00:00') <= create_at AND  create_at < 
datetime('2017-02-26T00:00:00')
{code}

{code}
unnest-map [$$64, $$69, $$70] <- index-search("text_idx", 2, "twitter", 
"ds_tweet", FALSE, FALSE, 5, null, 21, TRUE, 1, $$63) with filter on min:[] 
max:[$$67, $$68]
-- 
SINGLE_PARTITION_INVERTED_INDEX_SEARCH  |PARTITIONED|
  exchange
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
assign [$$67, $$68, $$63] <- 
[datetime: { 2017-02-26T00:00:00.000Z }, datetime: { 2017-02-25T00:00:00.000Z 
}, array: [ "trump" ]]
-- ASSIGN  |PARTITIONED|
  empty-tuple-source
  -- EMPTY_TUPLE_SOURCE  
|PARTITIONED|
{code}


However, for the following query (which merely swaps the positions of the datetime 
constant and create_at in the last predicate), it shows the correct plan.

{code}
SELECT spatial_cell(get_points(place.bounding_box)[0], 
create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet
WHERE ftcontains(text, ['trump'], {'mode':'any'}) AND place.bounding_box IS NOT 
unknown 
AND datetime('2017-02-25T00:00:00') <= create_at AND  
datetime('2017-02-26T00:00:00') > create_at
GROUP BY cell;
{code}

Exact predicates on the filter:
{code}
datetime('2017-02-25T00:00:00') <= create_at AND  
datetime('2017-02-26T00:00:00') > create_at
{code}

{code}
unnest-map [$$64, $$69, $$70] <- index-search("text_idx", 2, "twitter", 
"ds_tweet", FALSE, FALSE, 5, null, 21, TRUE, 1, $$63) with filter on min:[$$67] 
max:[$$68]
-- 
SINGLE_PARTITION_INVERTED_INDEX_SEARCH  |PARTITIONED|
  exchange
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
assign [$$67, $$68, $$63] <- 
[datetime: { 2017-02-26T00:00:00.000Z }, datetime: { 2017-02-25T00:00:00.000Z 
}, array: [ "trump" ]]
-- ASSIGN  |PARTITIONED|
  empty-tuple-source
  -- EMPTY_TUPLE_SOURCE  
|PARTITIONED|
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ASTERIXDB-2185) Cluster becomes UNUSABLE status after a NC fails to send a job failure.

2017-12-04 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2185:
-

 Summary: Cluster becomes UNUSABLE status after a NC fails to send 
a job failure.
 Key: ASTERIXDB-2185
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2185
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


A cluster entered UNUSABLE status after an NC failed to send a job-failure message. 
See the exception below.

{code}
Dec 03, 2017 6:47:13 PM org.apache.hyracks.control.nc.work.StartTasksWork run
INFO: Initializing TAID:TID:ANID:ODID:16:0:1:0 -> [Asterix {
  ets;
  assign [0, 1, 2] := [Constant, Constant, Constant];
}, 
org.apache.hyracks.storage.am.lsm.invertedindex.dataflow.LSMInvertedIndexSearchOperatorDescriptor@23d902c1,
 org.apache.hyracks.dataflow.std.sort.ExternalSortOperatorDescriptor$1@2fc09944]
Dec 03, 2017 6:47:13 PM 
org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1
 close
INFO: InitialNumberOfRuns:0
Dec 03, 2017 6:47:13 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:13:0:1:0
Dec 03, 2017 6:47:13 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:13:0:0:0
Dec 03, 2017 6:47:13 PM 
org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$SortActivity$1
 close
INFO: InitialNumberOfRuns:0
Dec 03, 2017 6:47:13 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:16:0:0:0
Dec 03, 2017 6:47:13 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:16:0:1:0
Dec 03, 2017 6:48:02 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: AbortTasks
Dec 03, 2017 6:48:02 PM org.apache.hyracks.control.nc.work.AbortTasksWork run
INFO: Aborting Tasks: JID:0:[TAID:TID:ANID:ODID:0:0:0:0, 
TAID:TID:ANID:ODID:3:0:0:0, TAID:TID:ANID:ODID:3:0:1:0]
Dec 03, 2017 6:48:02 PM org.apache.hyracks.control.nc.Task run
WARNING: Task TAID:TID:ANID:ODID:3:0:0:0 failed with exception
java.lang.InterruptedException
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1302)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
at org.apache.hyracks.control.nc.Task.run(Task.java:325)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)

Dec 03, 2017 6:48:02 PM org.apache.hyracks.control.nc.Task run
WARNING: Task TAID:TID:ANID:ODID:3:0:1:0 failed with exception
java.lang.InterruptedException
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1302)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
at org.apache.hyracks.control.nc.Task.run(Task.java:325)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)

Dec 03, 2017 6:48:02 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskFailure
Dec 03, 2017 6:48:02 PM 
org.apache.hyracks.control.nc.work.NotifyTaskFailureWork run
WARNING: 1 is sending a notification to cc that task TAID:TID:ANID:ODID:3:0:0:0 
has failed
org.apache.hyracks.api.exceptions.HyracksDataException: HYR0003: 
java.lang.InterruptedException
at 
org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:68)
at org.apache.hyracks.control.nc.Task.run(Task.java:367)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.InterruptedException
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1302)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
at org.apache.hyracks.control.nc.Task.run(Task.java:325)
... 3 more


.. The same exception was repeated several times ..

Dec 03, 2017 6:48:02 PM 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskFailure
Dec 03, 2017 6:48:02 PM 
org.apache.hyracks.control.nc.work.NotifyTaskFailureWork run
WARNING: 1 is sending a notification to cc that task TAID:TID:ANID:ODID:3:0:0:0 
has failed
{code}

[jira] [Commented] (ASTERIXDB-2183) LSMHarness merge fails

2017-11-30 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273994#comment-16273994
 ] 

Taewoo Kim commented on ASTERIXDB-2183:
---

I think the issue is caused by the inverted list. Check the 
OnDiskInvertedIndexRangeSearchCursor class: it executes pinPages() to load all 
pages of a token's inverted list at once.

> LSMHarness merge fails
> --
>
> Key: ASTERIXDB-2183
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2183
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>
> LSMHarness merge() operation fails on indexes with the following error. All 
> index types are affected.
> Nov 30, 2017 6:58:07 PM 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness merge
> SEVERE: Failed merge operation on {"class" : "LSMInvertedIndex", "dir" : 
> "/mnt/ssd/scratch/waans11/asterixdb/io1/storage/partition_4/twitter/ds_tweet_idx_text_idx",
>  "memory" : 2, "disk" : 894}
> java.lang.Error: Maximum lock count exceeded
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:528)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:488)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
> at 
> org.apache.hyracks.storage.common.buffercache.CachedPage.acquireReadLatch(CachedPage.java:120)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.acquireLatch(BTree.java:542)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.performOp(BTree.java:570)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.search(BTree.java:198)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.access$300(BTree.java:69)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree$BTreeAccessor.search(BTree.java:902)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndexRangeSearchCursor.open(OnDiskInvertedIndexRangeSearchCursor.java:74)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.rangeSearch(OnDiskInvertedIndex.java:463)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexRangeSearchCursor.open(LSMInvertedIndexRangeSearchCursor.java:68)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndex.search(LSMInvertedIndex.java:223)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndex.doMerge(LSMInvertedIndex.java:353)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.merge(AbstractLSMIndex.java:671)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:574)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.merge(LSMInvertedIndexAccessor.java:124)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:45)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:30)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:744)
> This error was generated continuously until the node became inactive.





[jira] [Commented] (ASTERIXDB-2183) LSMHarness merge fails

2017-11-30 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273873#comment-16273873
 ] 

Taewoo Kim commented on ASTERIXDB-2183:
---

I think this is related to the inverted-index search, since it tries to hold all 
pages of the given inverted list in the buffer cache at once. As the number of pages 
on disk grows, the number of read locks held at one time grows with it. The Java 
documentation 
(https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html)
 says that a ReentrantReadWriteLock supports a maximum of 65,535 locks, so once that 
many locks are held concurrently, AsterixDB cannot acquire another one. 
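The limit is easy to reproduce outside AsterixDB. A minimal sketch using only the JDK (no AsterixDB code): a single ReentrantReadWriteLock supports at most 65,535 concurrently held read locks, and the next acquisition throws the same `java.lang.Error: Maximum lock count exceeded` seen in the merge stack trace.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Demonstrates the ReentrantReadWriteLock hold-count limit: the shared
// (read) count is kept in 16 bits, so at most 65,535 read locks can be
// held at once; one more throws java.lang.Error.
public class MaxLockCountDemo {
    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        for (int i = 0; i < 65535; i++) {
            lock.readLock().lock(); // all 65,535 acquisitions succeed
        }
        try {
            lock.readLock().lock(); // the 65,536th acquisition fails
        } catch (Error e) {
            System.out.println(e.getMessage()); // prints: Maximum lock count exceeded
        }
        for (int i = 0; i < 65535; i++) {
            lock.readLock().unlock(); // release everything we hold
        }
    }
}
```

This mirrors the merge failure: one cursor pinning (and read-latching) every page of a large on-disk inverted list can exhaust the per-lock count long before memory runs out.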

> LSMHarness merge fails
> --
>
> Key: ASTERIXDB-2183
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2183
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>
> LSMHarness merge() operation fails on indexes with the following error. All 
> index types are affected.
> Nov 30, 2017 6:58:07 PM 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness merge
> SEVERE: Failed merge operation on {"class" : "LSMInvertedIndex", "dir" : 
> "/mnt/ssd/scratch/waans11/asterixdb/io1/storage/partition_4/twitter/ds_tweet_idx_text_idx",
>  "memory" : 2, "disk" : 894}
> java.lang.Error: Maximum lock count exceeded
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:528)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:488)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
> at 
> org.apache.hyracks.storage.common.buffercache.CachedPage.acquireReadLatch(CachedPage.java:120)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.acquireLatch(BTree.java:542)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.performOp(BTree.java:570)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.search(BTree.java:198)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree.access$300(BTree.java:69)
> at 
> org.apache.hyracks.storage.am.btree.impls.BTree$BTreeAccessor.search(BTree.java:902)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndexRangeSearchCursor.open(OnDiskInvertedIndexRangeSearchCursor.java:74)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.rangeSearch(OnDiskInvertedIndex.java:463)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexRangeSearchCursor.open(LSMInvertedIndexRangeSearchCursor.java:68)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndex.search(LSMInvertedIndex.java:223)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndex.doMerge(LSMInvertedIndex.java:353)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.merge(AbstractLSMIndex.java:671)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:574)
> at 
> org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.merge(LSMInvertedIndexAccessor.java:124)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:45)
> at 
> org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:30)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:744)
> This error was generated continuously until the node became inactive.





[jira] [Created] (ASTERIXDB-2183) LSMHarness merge fails

2017-11-30 Thread Taewoo Kim (JIRA)
Taewoo Kim created ASTERIXDB-2183:
-

 Summary: LSMHarness merge fails
 Key: ASTERIXDB-2183
 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2183
 Project: Apache AsterixDB
  Issue Type: Bug
Reporter: Taewoo Kim


LSMHarness merge() operation fails on indexes with the following error. All 
index types are affected.

Nov 30, 2017 6:58:07 PM 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness merge
SEVERE: Failed merge operation on {"class" : "LSMInvertedIndex", "dir" : 
"/mnt/ssd/scratch/waans11/asterixdb/io1/storage/partition_4/twitter/ds_tweet_idx_text_idx",
 "memory" : 2, "disk" : 894}
java.lang.Error: Maximum lock count exceeded
at 
java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:528)
at 
java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:488)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
at 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
at 
org.apache.hyracks.storage.common.buffercache.CachedPage.acquireReadLatch(CachedPage.java:120)
at 
org.apache.hyracks.storage.am.btree.impls.BTree.acquireLatch(BTree.java:542)
at 
org.apache.hyracks.storage.am.btree.impls.BTree.performOp(BTree.java:570)
at 
org.apache.hyracks.storage.am.btree.impls.BTree.search(BTree.java:198)
at 
org.apache.hyracks.storage.am.btree.impls.BTree.access$300(BTree.java:69)
at 
org.apache.hyracks.storage.am.btree.impls.BTree$BTreeAccessor.search(BTree.java:902)
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndexRangeSearchCursor.open(OnDiskInvertedIndexRangeSearchCursor.java:74)
at 
org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.rangeSearch(OnDiskInvertedIndex.java:463)
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexRangeSearchCursor.open(LSMInvertedIndexRangeSearchCursor.java:68)
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndex.search(LSMInvertedIndex.java:223)
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndex.doMerge(LSMInvertedIndex.java:353)
at 
org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndex.merge(AbstractLSMIndex.java:671)
at 
org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.merge(LSMHarness.java:574)
at 
org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexAccessor.merge(LSMInvertedIndexAccessor.java:124)
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:45)
at 
org.apache.hyracks.storage.am.lsm.common.impls.MergeOperation.call(MergeOperation.java:30)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)

This error was generated continuously until the node became inactive.






[jira] [Assigned] (ASTERIXDB-2176) Deletion doesn't work on the RTree index.

2017-11-27 Thread Taewoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim reassigned ASTERIXDB-2176:
-

Assignee: Chen Luo

> Deletion doesn't work on the RTree index.
> -
>
> Key: ASTERIXDB-2176
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2176
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>Assignee: Chen Luo
>
> This is a simplified version of "upsert/primary-secondary-tree" AQL test case.
> spatialData.json file
> { "id": 12, "point": point("6.0,3.0") }
> moreSpatialData.json file
> {"id": 12, "point": point("4.1,7.0")}
> DDL: 
> {code}
> drop dataverse test if exists;
> create dataverse test;
> use dataverse test;
> create type MyRecord as closed {
>   id: int64,
>   point: point
> }
> create dataset UpsertTo(MyRecord)
>  primary key id;
> create dataset UpsertFrom(MyRecord)
>  primary key id;
> create index rtree_index_point on UpsertTo(point) type rtree;
> {code}
> DML
> {code}
> load dataset UpsertTo
> using localfs
> (("path"="asterix_nc1://data/spatial/spatialData.json"),("format"="adm"));
> load dataset UpsertFrom
> using localfs
> (("path"="asterix_nc1://data/spatial/moreSpatialData.json"),("format"="adm"));
> upsert into dataset UpsertTo(
> for $x in dataset UpsertFrom
> return $x
> );
> for $o in dataset('UpsertTo')
> where spatial-intersect($o.point, 
> create-polygon([4.0,1.0,4.0,4.0,12.0,4.0,12.0,1.0]))
> order by $o.id
> return $o;
> {code}
> This DML returns the new record correctly. But, the issue is that the indexed 
> value in RTree has not been updated.
> When searching the rtree_index_point index, the searcher sees the previous 
> value - point("6.0,3.0"), not the new value - point("4.1,7.0"). So, this 
> record will be fetched from the primary index. However, the primary index 
> search returns the updated value and the select() verifies the value. It 
> returns true by coincidence because the new value satisfies the 
> spatial-intersect() condition.
> The secondary index search should see the updated value, not the previous 
> value. And this is an issue for the index-only plan case since it only uses 
> the value from a secondary index.
>  





[jira] [Commented] (ASTERIXDB-2176) Deletion doesn't work on the RTree index.

2017-11-24 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265520#comment-16265520
 ] 

Taewoo Kim commented on ASTERIXDB-2176:
---

Adding one more point:

An sqlpp execution test case, "scan-delete-rtree-secondary-index", does the 
following:

(1) creates a dataset
(2) loads 21 records (id = 1 ~ 21)
(3) creates an RTree index
(4) deletes 10 records (id > 10)
(5) runs a query

On the master branch, an RTree search reveals ids 12 and 20. These records are 
filtered out by the primary-index search, since it cannot find records 12 and 20, 
so they are not seen. Overall, I think deletion on the RTree does not work now, and 
the problem is hidden by the primary-index search.

> Deletion doesn't work on the RTree index.
> -
>
> Key: ASTERIXDB-2176
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2176
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>
> This is a simplified version of "upsert/primary-secondary-tree" AQL test case.
> spatialData.json file
> { "id": 12, "point": point("6.0,3.0") }
> moreSpatialData.json file
> {"id": 12, "point": point("4.1,7.0")}
> DDL: 
> {code}
> drop dataverse test if exists;
> create dataverse test;
> use dataverse test;
> create type MyRecord as closed {
>   id: int64,
>   point: point
> }
> create dataset UpsertTo(MyRecord)
>  primary key id;
> create dataset UpsertFrom(MyRecord)
>  primary key id;
> create index rtree_index_point on UpsertTo(point) type rtree;
> {code}
> DML
> {code}
> load dataset UpsertTo
> using localfs
> (("path"="asterix_nc1://data/spatial/spatialData.json"),("format"="adm"));
> load dataset UpsertFrom
> using localfs
> (("path"="asterix_nc1://data/spatial/moreSpatialData.json"),("format"="adm"));
> upsert into dataset UpsertTo(
> for $x in dataset UpsertFrom
> return $x
> );
> for $o in dataset('UpsertTo')
> where spatial-intersect($o.point, 
> create-polygon([4.0,1.0,4.0,4.0,12.0,4.0,12.0,1.0]))
> order by $o.id
> return $o;
> {code}
> This DML returns the new record correctly. But, the issue is that the indexed 
> value in RTree has not been updated.
> When searching the rtree_index_point index, the searcher sees the previous 
> value - point("6.0,3.0"), not the new value - point("4.1,7.0"). So, this 
> record will be fetched from the primary index. However, the primary index 
> search returns the updated value and the select() verifies the value. It 
> returns true by coincidence because the new value satisfies the 
> spatial-intersect() condition.
> The secondary index search should see the updated value, not the previous 
> value. And this is an issue for the index-only plan case since it only uses 
> the value from a secondary index.
>  





[jira] [Comment Edited] (ASTERIXDB-2176) Deletion doesn't work on the RTree index.

2017-11-24 Thread Taewoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265520#comment-16265520
 ] 

Taewoo Kim edited comment on ASTERIXDB-2176 at 11/24/17 8:35 PM:
-

Adding one more point:

An sqlpp execution test case: "scan-delete-rtree-secondary-index" conducts the 
following:

(1) creates a dataset 
(2) loads 21 records. (id = 1 ~ 21)
(3) creates an RTree index
(4) deletes 10 records (id > 10) 
(5) spatial-intersect query

On the master branch, an RTree search reveals ids 12 and 20. These records are 
filtered out by the primary-index search, since it cannot find records 12 and 20, 
so they are not seen. Overall, I think deletion on the RTree does not work now, and 
the problem is hidden by the primary-index search.



> Deletion doesn't work on the RTree index.
> -
>
> Key: ASTERIXDB-2176
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2176
> Project: Apache AsterixDB
>  Issue Type: Bug
>Reporter: Taewoo Kim
>
> This is a simplified version of "upsert/primary-secondary-tree" AQL test case.
> spatialData.json file
> { "id": 12, "point": point("6.0,3.0") }
> moreSpatialData.json file
> {"id": 12, "point": point("4.1,7.0")}
> DDL: 
> {code}
> drop dataverse test if exists;
> create dataverse test;
> use dataverse test;
> create type MyRecord as closed {
>   id: int64,
>   point: point
> }
> create dataset UpsertTo(MyRecord)
>  primary key id;
> create dataset UpsertFrom(MyRecord)
>  primary key id;
> create index rtree_index_point on UpsertTo(point) type rtree;
> {code}
> DML
> {code}
> load dataset UpsertTo
> using localfs
> (("path"="asterix_nc1://data/spatial/spatialData.json"),("format"="adm"));
> load dataset UpsertFrom
> using localfs
> (("path"="asterix_nc1://data/spatial/moreSpatialData.json"),("format"="adm"));
> upsert into dataset UpsertTo(
> for $x in dataset UpsertFrom
> return $x
> );
> for $o in dataset('UpsertTo')
> where spatial-intersect($o.point, 
> create-polygon([4.0,1.0,4.0,4.0,12.0,4.0,12.0,1.0]))
> order by $o.id
> return $o;
> {code}
> This DML returns the new record correctly. But, the issue is that the indexed 
> value in RTree has not been updated.
> When searching the rtree_index_point index, the searcher sees the previous 
> value - point("6.0,3.0"), not the new value - point("4.1,7.0"). So, this 
> record will be fetched from the primary index. However, the primary index 
> search returns the updated value and the select() verifies the value. It 
> returns true by coincidence because the new value satisfies the 
> spatial-intersect() condition.
> The secondary index search should see the updated value, not the previous 
> value. And this is an issue for the index-only plan case since it only uses 
> the value from a secondary index.
>  




