Re: endless query execution

Jinho Kim Tue, 19 Aug 2014 20:01:03 -0700

Hello Chris,

Can you stop iptables for testing? And can you show me the LISTEN ports
reported by netstat?
You can find the shuffle port in the worker log.


2014-08-19 22:43:05,635 INFO
org.apache.tajo.pullserver.TajoPullServerService: httpshuffle
listening on port 56850
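
A minimal sketch of that check, assuming Python 3 is available on the worker machine. The helper names find_shuffle_port and is_port_open are made up for illustration and are not part of Tajo. It reads the port from a log line like the one above and then tries a plain TCP connect, which is roughly the client-side counterpart of looking for the LISTEN entry in netstat:

```python
import re
import socket

# Matches the PullServer startup line, e.g. "... httpshuffle listening on port 56850".
# \s+ also matches line breaks, so a wrapped log line still matches.
PORT_PATTERN = re.compile(r"httpshuffle\s+listening\s+on\s+port\s+(\d+)")


def find_shuffle_port(log_text):
    """Return the last shuffle port announced in the log text, or None."""
    matches = PORT_PATTERN.findall(log_text)
    return int(matches[-1]) if matches else None


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable addresses.
        return False


if __name__ == "__main__":
    sample = (
        "2014-08-19 22:43:05,635 INFO\n"
        "org.apache.tajo.pullserver.TajoPullServerService: httpshuffle\n"
        "listening on port 56850\n"
    )
    port = find_shuffle_port(sample)
    print("shuffle port:", port)
    # The two addresses that appear in this thread; adjust as needed.
    for host in ("127.0.0.1", "127.0.1.1"):
        state = "reachable" if is_port_open(host, port) else "not reachable"
        print(host, state)
```

If 127.0.0.1 is reachable but the address shown in the fetch URI is not, the cause is more likely address resolution or a firewall than the PullServer itself.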

-Jinho
Best regards


2014-08-20 5:47 GMT+09:00 Christian Schwabe <[email protected]>:
>
> Hello Jinho,
>
> thanks for your reply.
> I took a look at /etc/hosts in my virtual machine and made sure that it
> still contains 127.0.0.1 localhost.
> Now Tajo connects to localhost.
> But there seems to be another problem too.
> The ports that Tajo connected to did not respond.
>
> I'm slowly getting a bit desperate. I cannot quite understand why it is so
> hard to just get Tajo running, and I really do not know what else to do.
> I need this software for my thesis and would like to run extensive tests
> with it, but unfortunately that is not possible at the moment.
> Do you still have any tips or hints to make it work?
>
>
> Best regards,
> Chris
>
> On 19.08.2014 05:08:07, Jinho Kim wrote:
>
> Hello Christian,
>
> TajoPullServer uses the local address, and we don't change the loopback
> address.
> If you start up a cluster, each worker must have a unique address.
> I think your /etc/hosts or virtual machine has an entry like
> '<127.0.1.1 yourhostname>'.
>
> -Jinho
> Best regards
>
>
> 2014-08-19 11:11 GMT+09:00 Christian Schwabe <[email protected]>:
>
>
> Hello Hyunsik,
>
> after some intensive digging through Tajo's error log I think I have found
> the error.
> First of all: Master and Worker both run on my MacBook.
> The problem shows up right in the first line of the log excerpt:
>
> 2014-08-19 03:56:17,334 INFO org.apache.tajo.worker.Fetcher: Status:
> FETCH_FETCHING,
> URI:http://127.0.1.1:56178/?qid=q_1408413350431_0001&sid=1&p=0&type=h&ta=0_0
> 2014-08-19 03:56:17,362 INFO
> org.apache.tajo.pullserver.TajoPullServerService: PullServer request param:
> shuffleType=h, sid=1, partId=0, taskIds=[0_0]
> 2014-08-19 03:56:17,363 INFO
> org.apache.tajo.pullserver.TajoPullServerService: PullServer baseDir:
> /tmp/tajo-christian/tmpdir/q_1408413350431_0001/output
> 2014-08-19 03:56:17,386 ERROR org.apache.tajo.worker.Fetcher: Fetch failed :
> java.lang.IllegalArgumentException: invalid version format: 404
>
>     at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:102)
>     at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62)
>     at org.jboss.netty.handler.codec.http.HttpResponseDecoder.createMessage(HttpResponseDecoder.java:104)
>     at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:189)
>     at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.decode(HttpClientCodec.java:143)
>     at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.decode(HttpClientCodec.java:127)
>     at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
>     at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
>     at org.jboss.netty.handler.codec.http.HttpClientCodec.handleUpstream(HttpClientCodec.java:92)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
>
> Tajo tries to connect to an IP address that is not valid.
> The correct IP address to connect to is 127.0.0.1, not 127.0.1.1.
> I hope my conclusions are correct up to this point.
> Is there a way to force Tajo to use localhost?
>
> I want to reiterate that this behavior has never occurred before.
>
> Best regards,
> Chris
>
>
>
>
> On 17.08.2014 16:06:00, Christian Schwabe wrote:
>
> Hello Hyunsik,
>
> thanks for your reply. Unfortunately I have never heard of "Netty" and
> therefore do not know what exactly it is. How and where exactly am I
> supposed to check this?
>
>
> Best regards,
> Chris
> On 17.08.2014 at 09:27, Hyunsik Choi <[email protected]> wrote:
>
> Hi Chris,
>
> According to your log message, the HTTP server and client implemented with
> Netty are not working correctly. I suspect that your application pulls in
> another Netty version through other dependencies. Tajo currently uses
> 3.6.6.Final. Could you check whether your dependencies include another
> Netty version?
>
>
> 2014-08-14 08:46:26,795 INFO
> org.apache.tajo.pullserver.TajoPullServerService: PullServer baseDir:
> /tmp/tajo-chris/tmpdir/q_1407998756522_0001/output
> 2014-08-14 08:46:26,798 ERROR org.apache.tajo.worker.Fetcher: Fetch failed :
> java.lang.IllegalArgumentException: invalid version format: FOUND
> at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:102)
> at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62)
> at org.jboss.netty.handler.codec.http.HttpResponseDecoder.createMessage(HttpResponseDecoder.java:104)
> at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:189)
> at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.decode(HttpClientCodec.java:143)
> at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.decode(HttpClientCodec.java:127)
> at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
> at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
>
> Best regards,
> Hyunsik
>
>
> On Thu, Aug 14, 2014 at 4:04 PM, Christian Schwabe
> <[email protected]> wrote:
>
> Hello Hyunsik,
>
> Thank you for your reply.
> I'm very glad that it is apparently not a mistake on my part but a known
> issue, and of course I would appreciate help in resolving it.
> I have attached the worker log.
> Glad to hear that a committer is already working on the second problem.
>
> Best regards,
> Chris
>
>
>
> On 14.08.2014 at 00:24, Hyunsik Choi <[email protected]> wrote:
>
> There may be two problems:
> 1) One is that a simple query causes an error.
> 2) The other is an endlessly running query.
>
> As for the first problem, the query is very simple, so I expect there is
> some trivial problem related to the data or the configuration. To help you
> figure out the problem, could you share the worker log? The worker log
> contains the detailed causes of the error.
>
> Actually, one committer is already digging into the second problem. A few
> specific cases cause an endlessly running query. I believe that this
> problem will be fixed soon.
>
> Thanks,
> Hyunsik
>
>
> On Thu, Aug 14, 2014 at 4:08 AM, Christian Schwabe
> <[email protected]> wrote:
>
> Hello guys,
>
> and another mail. Sorry for spamming.
> Working with Apache Tajo is exciting, but right now a lot of questions and
> problems are coming up at once.
>
> I just pulled the latest state from the GitHub repository and recompiled
> everything with 'mvn clean package -DskipTests -Pdist -Dtar'.
> I copied the previously saved configuration back into place. Up to this
> point everything went as usual, without problems.
> To test the new build I ran the minimal example
> (http://tajo.apache.org/docs/0.8.0/jdbc_driver.html).
> I used the following statement on the mentioned dataset:
> 'select count(*) from table1'.
>
> I received the following warnings:
> 2014-08-13 20:45:04.856 java[14232:1903] Unable to load realm info from
> SCDynamicStore
> 2014-08-13 20:45:04,925 WARN: org.apache.hadoop.util.NativeCodeLoader
> (<clinit>(62)) - Unable to load native-hadoop library for your platform...
> using builtin-java classes where applicable
> 2014-08-13 20:45:06,372 WARN: org.apache.tajo.client.TajoClient
> (getQueryResultAndWait(528)) - Query (q_1407955121364_0003) failed:
> QUERY_ERROR
>
> Looking at the WebUI, I see the following execution errors:
>
> Finished Query
>
> QueryId               Status       StartTime            FinishTime           Progress  RunTime
> q_1407955121364_0003  QUERY_ERROR  2014-08-13 20:45:05  2014-08-13 20:45:06  50%       ,0 sec
>
>
>
>
>
>
>
>
> >> Now the details for the Tajo Worker <<
>
> q_1407955121364_0003 [Query Plan]
>
> ID                            State      Started              Finished             Running time  Progress  Tasks
> eb_1407955121364_0003_000001  SUCCEEDED  2014-08-13 20:45:05  2014-08-13 20:45:05  ,0 sec        100,0%    1/1
> eb_1407955121364_0003_000002  ERROR      2014-08-13 20:45:05  2014-08-13 20:45:06  ,0 sec        ,0%       0/1
>
> Applied Session Variables
>
> ________________________________
>
> Logical Plan
>
> -----------------------------
> Query Block Graph
> -----------------------------
> |-#ROOT
> -----------------------------
> Optimization Log:
> [LogicalPlan]
> > ProjectionNode is eliminated.
> -----------------------------
>
> GROUP_BY(2)()
>   => exprs: (count())
>   => target list: ?count (INT8)
>   => out schema:{(1) ?count (INT8)}
>   => in schema:{(0) }
>    SCAN(0) on default.table1
>      => target list:
>      => out schema: {(0) }
>      => in schema: {(5) default.table1.id (INT4),default.table1.new_column
> (TEXT),default.table1.name (TEXT),default.table1.score
> (FLOAT4),default.table1.type (TEXT)}
>
> ________________________________
>
> Distributed Query Plan
>
> -------------------------------------------------------------------------------
> Execution Block Graph (TERMINAL - eb_1407955121364_0003_000003)
> -------------------------------------------------------------------------------
> |-eb_1407955121364_0003_000003
>    |-eb_1407955121364_0003_000002
>       |-eb_1407955121364_0003_000001
> -------------------------------------------------------------------------------
> Order of Execution
> -------------------------------------------------------------------------------
> 1: eb_1407955121364_0003_000001
> 2: eb_1407955121364_0003_000002
> 3: eb_1407955121364_0003_000003
> -------------------------------------------------------------------------------
>
> =======================================================
> Block Id: eb_1407955121364_0003_000001 [LEAF]
> =======================================================
>
> [Outgoing]
> [q_1407955121364_0003] 1 => 2 (type=HASH_SHUFFLE, key=, num=1)
>
> GROUP_BY(5)()
>   => exprs: (count())
>   => target list: ?count_1 (INT8)
>   => out schema:{(1) ?count_1 (INT8)}
>   => in schema:{(0) }
>    SCAN(0) on default.table1
>      => target list:
>      => out schema: {(0) }
>      => in schema: {(5) default.table1.id (INT4),default.table1.new_column
> (TEXT),default.table1.name (TEXT),default.table1.score
> (FLOAT4),default.table1.type (TEXT)}
>
> =======================================================
> Block Id: eb_1407955121364_0003_000002 [ROOT]
> =======================================================
>
> [Incoming]
> [q_1407955121364_0003] 1 => 2 (type=HASH_SHUFFLE, key=, num=1)
>
> GROUP_BY(2)()
>   => exprs: (count(?count_1 (INT8)))
>   => target list: ?count (INT8)
>   => out schema:{(1) ?count (INT8)}
>   => in schema:{(1) ?count_1 (INT8)}
>    SCAN(6) on eb_1407955121364_0003_000001
>      => out schema: {(1) ?count_1 (INT8)}
>      => in schema: {(1) ?count_1 (INT8)}
>
> =======================================================
> Block Id: eb_1407955121364_0003_000003 [TERMINAL]
> =======================================================
>
>
>
>
>
> eb_1407955121364_0003_000002
>
> ________________________________
>
> GROUP_BY(2)()
>   => exprs: (count(?count_1 (INT8)))
>   => target list: ?count (INT8)
>   => out schema:{(1) ?count (INT8)}
>   => in schema:{(1) ?count_1 (INT8)}
>    SCAN(6) on eb_1407955121364_0003_000001
>      => out schema: {(1) ?count_1 (INT8)}
>      => in schema: {(1) ?count_1 (INT8)}
>
>
> Status:ERROR
> Started:2014-08-13 20:45:05 ~ 2014-08-13 20:45:06
> # Tasks: 1 (Local Tasks: 0, Rack Local Tasks: 0)
> Progress:,0%
> # Shuffles:0
> Input Bytes: 0 B (0 B)
> Actual Processed Bytes:-
> Input Rows:0
> Output Bytes: 0 B (0 B)
> Output Rows:0
> ________________________________
> Status:          ALL         SCHEDULED         RUNNING         SUCCEEDED
> No  Id                                  Status   Progress  Started              Running Time  Host
> 1   t_1407955121364_0003_000002_000000  RUNNING  ,0%       2014-08-13 20:45:05  1054768 ms    christians-mbp.fritz.box
>
>
>
>
>
> eb_1407955121364_0003_000002
>
> ________________________________
> ID: t_1407955121364_0003_000002_000000
> Progress: ,0%
> State: RUNNING
> Launch Time: 2014-08-13 20:45:05
> Finish Time: -
> Running Time: 1116702 ms
> Host: christians-mbp.fritz.box
> Shuffles: # Shuffle Outputs: 0, Shuffle Key: -, Shuffle file: -
> Data Locations: DataLocation{host=unknown, volumeId=-1}
> Fragment: "fragment": {"id": "eb_1407955121364_0003_000001", "path":
> file:/tmp/tajo-chris/warehouse/eb_1407955121364_0003_000001", "start":
> 0,"length": 0}
> Input Statistics: No input statistics
> Output Statistics: No input statistics
> Fetches: eb_1407955121364_0003_000001
> http://192.168.178.101:56834/?qid=q_1407955121364_0003&sid=1&p=0&type=h
>
>
>
> As you can see, the query keeps running and running, like an endless
> loop. I don't know what is wrong with it. It is a simple query.
> But the strange thing is that the same query runs correctly from the
> console.
> I hope this was not too much information at once, but I think these
> are the minimum logs you need to understand the described error.
> While I have been writing this up, the query has kept running, for 21
> minutes now.
>
>
> Best regards,
> Chris
>
>
>
>
>
>
>
>
>
>
>
>