Hello guys,

here is another mail, sorry for spamming. Working with Apache Tajo is just so exciting that a lot of questions and problems are coming up at once.

I just pulled the latest state from the GitHub repository and recompiled everything with 'mvn clean package -DskipTests -Pdist -Dtar'. Then I copied my previously saved configuration back into place. Up to this point everything went as usual, without any problems.

To test the new build I ran the minimal JDBC example (http://tajo.apache.org/docs/0.8.0/jdbc_driver.html) with the statement 'select count(*) from table1' on the dataset mentioned before. I received the following warnings:

2014-08-13 20:45:04.856 java[14232:1903] Unable to load realm info from SCDynamicStore
2014-08-13 20:45:04,925 WARN: org.apache.hadoop.util.NativeCodeLoader (<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-08-13 20:45:06,372 WARN: org.apache.tajo.client.TajoClient (getQueryResultAndWait(528)) - Query (q_1407955121364_0003) failed: QUERY_ERROR

In the WebUI I see the following execution errors:

Finished Query

QueryId               Status       StartTime            FinishTime           Progress  RunTime
q_1407955121364_0003  QUERY_ERROR  2014-08-13 20:45:05  2014-08-13 20:45:06  50%       ,0 sec

>> Now the details for the Tajo Worker <<

q_1407955121364_0003 [Query Plan]

ID                            State      Started              Finished             Running time  Progress  Tasks
eb_1407955121364_0003_000001  SUCCEEDED  2014-08-13 20:45:05  2014-08-13 20:45:05  ,0 sec        100,0%    1/1
eb_1407955121364_0003_000002  ERROR      2014-08-13 20:45:05  2014-08-13 20:45:06  ,0 sec        ,0%       0/1

Applied Session Variables

Logical Plan

-----------------------------
Query Block Graph
-----------------------------
|-#ROOT
-----------------------------
Optimization Log:
[LogicalPlan]
    > ProjectionNode is eliminated.
-----------------------------

GROUP_BY(2)()
  => exprs: (count())
  => target list: ?count (INT8)
  => out schema:{(1) ?count (INT8)}
  => in schema:{(0) }
   SCAN(0) on default.table1
     => target list:
     => out schema: {(0) }
     => in schema: {(5) default.table1.id (INT4),default.table1.new_column (TEXT),default.table1.name (TEXT),default.table1.score (FLOAT4),default.table1.type (TEXT)}

Distributed Query Plan

-------------------------------------------------------------------------------
Execution Block Graph (TERMINAL - eb_1407955121364_0003_000003)
-------------------------------------------------------------------------------
|-eb_1407955121364_0003_000003
   |-eb_1407955121364_0003_000002
      |-eb_1407955121364_0003_000001
-------------------------------------------------------------------------------
Order of Execution
-------------------------------------------------------------------------------
1: eb_1407955121364_0003_000001
2: eb_1407955121364_0003_000002
3: eb_1407955121364_0003_000003
-------------------------------------------------------------------------------

=======================================================
Block Id: eb_1407955121364_0003_000001 [LEAF]
=======================================================

[Outgoing]
[q_1407955121364_0003] 1 => 2 (type=HASH_SHUFFLE, key=, num=1)

GROUP_BY(5)()
  => exprs: (count())
  => target list: ?count_1 (INT8)
  => out schema:{(1) ?count_1 (INT8)}
  => in schema:{(0) }
   SCAN(0) on default.table1
     => target list:
     => out schema: {(0) }
     => in schema: {(5) default.table1.id (INT4),default.table1.new_column (TEXT),default.table1.name (TEXT),default.table1.score (FLOAT4),default.table1.type (TEXT)}

=======================================================
Block Id: eb_1407955121364_0003_000002 [ROOT]
=======================================================

[Incoming]
[q_1407955121364_0003] 1 => 2 (type=HASH_SHUFFLE, key=, num=1)

GROUP_BY(2)()
  => exprs: (count(?count_1 (INT8)))
  => target list: ?count (INT8)
  => out schema:{(1) ?count (INT8)}
  => in schema:{(1) ?count_1 (INT8)}
   SCAN(6) on eb_1407955121364_0003_000001
     => out schema: {(1) ?count_1 (INT8)}
     => in schema: {(1) ?count_1 (INT8)}

=======================================================
Block Id: eb_1407955121364_0003_000003 [TERMINAL]
=======================================================

eb_1407955121364_0003_000002

GROUP_BY(2)()
  => exprs: (count(?count_1 (INT8)))
  => target list: ?count (INT8)
  => out schema:{(1) ?count (INT8)}
  => in schema:{(1) ?count_1 (INT8)}
   SCAN(6) on eb_1407955121364_0003_000001
     => out schema: {(1) ?count_1 (INT8)}
     => in schema: {(1) ?count_1 (INT8)}

Status:                  ERROR
Started:                 2014-08-13 20:45:05 ~ 2014-08-13 20:45:06
# Tasks:                 1 (Local Tasks: 0, Rack Local Tasks: 0)
Progress:                ,0%
# Shuffles:              0
Input Bytes:             0 B (0 B)
Actual Processed Bytes:  -
Input Rows:              0
Output Bytes:            0 B (0 B)
Output Rows:             0

Status:

No  Id                                  Status   Progress  Started              Running Time  Host
1   t_1407955121364_0003_000002_000000  RUNNING  ,0%       2014-08-13 20:45:05  1054768 ms    christians-mbp.fritz.box

eb_1407955121364_0003_000002

ID                 t_1407955121364_0003_000002_000000
Progress           ,0%
State              RUNNING
Launch Time        2014-08-13 20:45:05
Finish Time        -
Running Time       1116702 ms
Host               christians-mbp.fritz.box
Shuffles           # Shuffle Outputs: 0, Shuffle Key: -, Shuffle file: -
Data Locations     DataLocation{host=unknown, volumeId=-1}
Fragment           "fragment": {"id": "eb_1407955121364_0003_000001", "path": file:/tmp/tajo-chris/warehouse/eb_1407955121364_0003_000001", "start": 0,"length": 0}
Input Statistics   No input statistics
Output Statistics  No input statistics
Fetches            eb_1407955121364_0003_000001
                   http://192.168.178.101:56834/?qid=q_1407955121364_0003&sid=1&p=0&type=h

As you can see, the task is still running and running, like an endless loop. I don't know what's wrong with it; it's a simple query. The strange thing is that the same query runs correctly from the console.

I hope this is not too much information at once.
But I think these are the minimum logs you need to understand the error described above. While I have been writing this mail, the query has kept running, for 21 minutes now.

Best regards,
Chris
