[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397688#comment-16397688
 ] 

Murtadha Hubail commented on ASTERIXDB-2326:
--------------------------------------------

[~James Fang],

AsterixHyracksIntegrationUtil is configured to remove any result reference 
after two minutes, so any query whose execution takes longer than that will 
fail with this exception. To make the query work, change the following 
settings in the AsterixHyracksIntegrationUtil class to a very large value. 
You could also delete them, in which case the default of 24 hours is used 
before the result is removed.

ccConfig.setResultTTL(120000L);

ncConfig.setResultTTL(120000L);
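
As a rough check of this explanation against the log quoted below, here is a 
minimal sketch that compares the job's run time with the two TTL values. The 
class and helper names (ResultTtlCheck, elapsedMillis, exceedsTtl) are 
hypothetical and not part of AsterixDB; the two timestamps are the JobStart 
and RegisterResultPartitionLocation entries from the log, and the sketch 
assumes both fall on the same day.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class for illustration only; not part of AsterixDB.
class ResultTtlCheck {
    // TTL set in AsterixHyracksIntegrationUtil (two minutes, in milliseconds).
    static final long CONFIGURED_TTL_MS = 120000L;
    // Default TTL mentioned in the comment above (24 hours, in milliseconds).
    static final long DEFAULT_TTL_MS = TimeUnit.HOURS.toMillis(24);

    // Milliseconds between two same-day log timestamps such as "11:55:31.345".
    static long elapsedMillis(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end)).toMillis();
    }

    // True when the job ran longer than the given result TTL.
    static boolean exceedsTtl(long elapsedMs, long ttlMs) {
        return elapsedMs > ttlMs;
    }

    public static void main(String[] args) {
        // JobStart and RegisterResultPartitionLocation timestamps from the quoted log.
        long elapsed = elapsedMillis("11:55:31.345", "12:00:57.365");
        System.out.println("elapsed ms: " + elapsed); // 326020, about 5.4 minutes
        System.out.println("exceeds 2 minute TTL: " + exceedsTtl(elapsed, CONFIGURED_TTL_MS));
        System.out.println("exceeds 24 hour TTL: " + exceedsTtl(elapsed, DEFAULT_TTL_MS));
    }
}
```

The job ran for roughly five and a half minutes, which is past the two-minute 
TTL but well within the 24-hour default, consistent with the result record 
being gone by the time the partition location was registered.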

> Cannot run aggregation functions when the external dataset size grows too 
> large
> -------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-2326
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2326
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: EXT - External data, FUN - Functions
>            Reporter: James Fang
>            Assignee: Murtadha Hubail
>            Priority: Major
>
> I was testing aggregation functions on external data, and found that the 
> aggregation functions would not work at all at 100 million tuples. At 
> 10 million tuples, the aggregates worked. None of the existing aggregates or 
> the aggregates I am adding will work for 100 million tuples. 
> DDL:
> DROP DATAVERSE AGG_TEST IF EXISTS;
> CREATE DATAVERSE AGG_TEST;
> USE AGG_TEST;
> CREATE TYPE Data AS {
>  id: int,
>  val: double
> };
> create external dataset dataval(Data) using 
> localfs((`path`=`127.0.0.1://Users/name/Documents/100000000.txt`),(`format`=`adm`));
>  
> Query:
> USE AGG_TEST;
> {"average":coll_avg((select element x.val from dataval as x))};
>  
> Error:
> 11:55:25.603 [Executor-3:ClusterController] INFO  
> org.apache.asterix.runtime.utils.ClusterStateManager - Cluster State is now 
> ACTIVE
> 11:55:30.447 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> GetDatasetDirectoryServiceInfo
> 11:55:30.917 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> GetNodeControllersInfo
> 11:55:31.345 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: JobStart
> 11:55:31.379 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService - 
> DatasetDirectoryService notified of new job JID:0.1
> 11:55:31.382 [Worker:ClusterController] INFO  
> org.apache.asterix.app.active.ActiveNotificationHandler - 
> notifyJobCreation(JobId jobId, JobSpecification jobSpecification) was called 
> with jobId = JID:0.1
> 11:55:31.382 [Worker:ClusterController] INFO  
> org.apache.asterix.app.active.ActiveNotificationHandler - Job is not of type 
> active job. property found to be: null
> 11:55:31.393 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.executor.ActivityClusterPlanner - Plan for 
> org.apache.hyracks.api.job.ActivityCluster@1264c6ff
> 11:55:31.393 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.executor.ActivityClusterPlanner - Built 1 Task 
> Clusters
> 11:55:31.393 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.executor.ActivityClusterPlanner - Tasks: 
> [TID:ANID:ODID:0:0:0, TID:ANID:ODID:2:0:0]
> 11:55:31.394 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.executor.JobExecutor - Runnable TC roots: 
> [TC:[TID:ANID:ODID:0:0:0, TID:ANID:ODID:2:0:0]], inProgressTaskClusters: []
> 11:55:31.412 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> WaitForJobCompletion
> 11:55:31.412 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: StartTasks
> 11:55:31.423 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.nc.work.StartTasksWork - Initializing 
> TAID:TID:ANID:ODID:0:0:0:0 -> 
> [org.apache.asterix.external.operators.ExternalScanOperatorDescriptor@74fb82e0,
>  AlgebricksMeta [assign [1] := 
> [org.apache.asterix.runtime.evaluators.functions.records.FieldAccessByIndexEvalFactory$_EvaluatorFactoryGen@30d487a5],
>  stream-project [1], assign 
> [org.apache.asterix.runtime.aggregates.std.LocalAvgAggregateDescriptor$2@6594e4ce]]]
>  for JID:0.1
> 11:55:31.450 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.nc.work.StartTasksWork - input: 0: CDID:1
> 11:55:31.453 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.nc.work.StartTasksWork - Initializing 
> TAID:TID:ANID:ODID:2:0:0:0 -> 
> [org.apache.hyracks.dataflow.std.result.ResultWriterOperatorDescriptor@71b17102,
>  AlgebricksMeta [assign 
> [org.apache.asterix.runtime.aggregates.std.GlobalAvgAggregateDescriptor$2@11121dfc],
>  assign [1] := 
> [org.apache.asterix.runtime.evaluators.common.ClosedRecordConstructorEvalFactory@443a919b],
>  stream-project [1]]] for JID:0.1
> 11:55:31.480 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.nc.work.StartTasksWork - input: 0: CDID:1
> 11:55:31.517 
> [org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:2:0:0:0:0]
>  INFO  org.apache.hyracks.control.nc.dataset.DatasetPartitionWriter - open(0)
> 12:00:57.342 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:0:0:0:0
> 12:00:57.351 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: TaskComplete: 
> [asterix_nc1[JID:0.1:TAID:TID:ANID:ODID:0:0:0:0]
> 12:00:57.365 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> RegisterResultPartitionLocation: JobId@JID:0.1 ResultSetId@RSID:0 Partition@0 
> NPartitions@1 
> ResultPartitionLocation@127.0.0.1:49695
>  OrderedResult@true EmptyResult@false
> 12:00:57.368 
> [org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:2:0:0:0:0]
>  INFO  org.apache.hyracks.control.nc.dataset.DatasetPartitionWriter - close(0)
> 12:00:57.373 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:2:0:0:0
> 12:00:57.377 [Worker:ClusterController] WARN  
> org.apache.hyracks.control.cc.work.RegisterResultPartitionLocationWork - 
> Failed to register partition location
> org.apache.hyracks.api.exceptions.HyracksDataException: HYR0024: No result 
> set for job JID:0.1
> at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:55)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.getNonNullDatasetJobRecord(DatasetDirectoryService.java:105)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.registerResultPartitionLocation(DatasetDirectoryService.java:114)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.work.RegisterResultPartitionLocationWork.run(RegisterResultPartitionLocationWork.java:71)
>  [classes/:?]
> at 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127)
>  [classes/:?]
> 12:00:57.393 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.executor.JobExecutor - Abort map for job: 
> JID:0.1: {asterix_nc1=[TAID:TID:ANID:ODID:2:0:0:0]}
> 12:00:57.394 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.executor.JobExecutor - Aborting: 
> [TAID:TID:ANID:ODID:2:0:0:0] at asterix_nc1
> 12:00:57.400 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.partitions.PartitionMatchMaker - Removing 
> uncommitted partitions: []
> 12:00:57.405 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.partitions.PartitionMatchMaker - Removing 
> partition requests: []
> 12:00:57.407 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> ReportResultPartitionWriteCompletion: JobId@JID:0.1 ResultSetId@RSID:0 
> Partition@0
> 12:00:57.407 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: AbortTasks
> 12:00:57.407 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.nc.work.AbortTasksWork - Aborting Tasks: 
> JID:0.1:[TAID:TID:ANID:ODID:2:0:0:0]
> 12:00:57.407 [Worker:ClusterController] WARN  
> org.apache.hyracks.control.common.work.WorkQueue - Exception while executing 
> ReportResultPartitionWriteCompletion: JobId@JID:0.1 ResultSetId@RSID:0 
> Partition@0
> java.lang.RuntimeException: 
> org.apache.hyracks.api.exceptions.HyracksDataException: HYR0024: No result 
> set for job JID:0.1
> at 
> org.apache.hyracks.control.cc.work.ReportResultPartitionWriteCompletionWork.run(ReportResultPartitionWriteCompletionWork.java:49)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127)
>  [classes/:?]
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: HYR0024: 
> No result set for job JID:0.1
> at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:55)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.getNonNullDatasetJobRecord(DatasetDirectoryService.java:105)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.reportResultPartitionWriteCompletion(DatasetDirectoryService.java:141)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.work.ReportResultPartitionWriteCompletionWork.run(ReportResultPartitionWriteCompletionWork.java:47)
>  ~[classes/:?]
> ... 1 more
> 12:00:57.408 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: TaskComplete: 
> [asterix_nc1[JID:0.1:TAID:TID:ANID:ODID:2:0:0:0]
> 12:00:57.409 [Worker:ClusterController] WARN  
> org.apache.hyracks.control.cc.executor.JobExecutor - Spurious task complete 
> notification: TAID:TID:ANID:ODID:2:0:0:0 Current state = ABORTED
> 12:00:57.409 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: JobCleanup: 
> JobId@JID:0.1 Status@FAILURE 
> Exceptions@[org.apache.hyracks.api.exceptions.HyracksDataException: HYR0024: 
> No result set for job JID:0.1]
> 12:00:57.409 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.cc.work.JobCleanupWork - Cleanup for JobRun with 
> id: JID:0.1
> 12:00:57.412 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: CleanupJoblet
> 12:00:57.413 [Worker:asterix_nc1] INFO  
> org.apache.hyracks.control.nc.work.CleanupJobletWork - Cleaning up after job: 
> JID:0.1
> 12:00:57.416 [Worker:asterix_nc1] INFO  org.apache.hyracks.control.nc.Joblet 
> - Freeing leaked 294912 bytes
> 12:00:57.421 [Worker:ClusterController] INFO  
> org.apache.hyracks.control.common.work.WorkQueue - Executing: 
> JobletCleanupNotification
> 12:00:57.421 [Worker:ClusterController] INFO  
> org.apache.asterix.app.active.ActiveNotificationHandler - Getting notified of 
> job finish for JobId: JID:0.1
> 12:00:57.421 [Worker:ClusterController] INFO  
> org.apache.asterix.app.active.ActiveNotificationHandler - NO NEED TO NOTIFY 
> JOB FINISH!
> 12:00:57.430 [IPC Network Listener Thread [/0:0:0:0:0:0:0:0:49684]] INFO  
> org.apache.hyracks.ipc.impl.IPCSystem - Exception in message
> org.apache.hyracks.api.exceptions.HyracksDataException: HYR0024: No result 
> set for job JID:0.1
> at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:55)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.getNonNullDatasetJobRecord(DatasetDirectoryService.java:105)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.registerResultPartitionLocation(DatasetDirectoryService.java:114)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.work.RegisterResultPartitionLocationWork.run(RegisterResultPartitionLocationWork.java:71)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127)
>  ~[classes/:?]
> 12:00:57.436 [HttpExecutor(port:19001)-0] ERROR org.apache.asterix - HYR0024: 
> No result set for job JID:0.1
> org.apache.hyracks.api.exceptions.HyracksDataException: HYR0024: No result 
> set for job JID:0.1
> at 
> org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:55)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.getNonNullDatasetJobRecord(DatasetDirectoryService.java:105)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.dataset.DatasetDirectoryService.registerResultPartitionLocation(DatasetDirectoryService.java:114)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.cc.work.RegisterResultPartitionLocationWork.run(RegisterResultPartitionLocationWork.java:71)
>  ~[classes/:?]
> at 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127)
>  ~[classes/:?]
> 12:00:57.442 [Worker:ClusterController] WARN  
> org.apache.hyracks.control.common.work.WorkQueue - Work 
> JobletCleanupNotification waited 0 times (~0ms), blocked 1 times (~0ms)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
