This is the error on the Hadoop job:

2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_201301231151_0002_m_000001_0 even after MAX_FETCH_RETRIES_PER_MAP retries... or it is a read error, reporting to the JobTracker
2013-01-23 12:15:44,885 FATAL org.apache.hadoop.mapred.ReduceTask: Shuffle failed with too many fetch failures and insufficient progress!Killing task attempt_201301231151_0002_r_000000_0.
2013-01-23 12:15:45,220 FATAL org.apache.hadoop.mapred.Task: Failed to contact the tasktracker
org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, with JvmId: jvm_201301231151_0002_r_1079250852

So something is wrong: either your network went down or a node went down. Hive tries to fetch the task log from the host (savitha-VirtualBox) and cannot figure out what that host is.

On Wed, Jan 23, 2013 at 1:28 PM, venkatramanan <venkatraman...@smartek21.com> wrote:

> No, all the nodes are up and running. I don't know; my guess is that the
> error occurs when Hive picks up the other node's hostname. Correct me if
> I am wrong.
>
> On Wednesday 23 January 2013 01:07 PM, Nitin Pawar wrote:
>
>> When you ran the query, did the VM shut down?
>>
>> On Wed, Jan 23, 2013 at 12:57 PM, venkatramanan <venkatraman...@smartek21.com> wrote:
>>
>> Hi,
>>
>> I got the following error while executing "select count(1) from tweettrend;".
>>
>> Below are the exact log messages from the JobTracker web interface.
>>
>> *Hive CLI error:*
>>
>> Exception in thread "Thread-21" java.lang.RuntimeException: Error while reading from task log url
>>     at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
>>     at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
>>     at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
>>     at java.lang.Thread.run(Thread.java:722)
>> Caused by: java.net.UnknownHostException: savitha-VirtualBox
>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
>>     at java.net.Socket.connect(Socket.java:579)
>>     at java.net.Socket.connect(Socket.java:528)
>>     at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>>     at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
>>     at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
>>     at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
>>     at sun.net.www.http.HttpClient.New(HttpClient.java:290)
>>     at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>>     at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
>>     at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
>>     at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
>>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1299)
>>     at java.net.URL.openStream(URL.java:1037)
>>     at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
>>     ... 3 more
>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
>> MapReduce Jobs Launched:
>> Job 0: Map: 2  Reduce: 1  Cumulative CPU: 9.0 sec  HDFS Read: 408671053  HDFS Write: 0  FAIL
>> Total MapReduce CPU Time Spent: 9 seconds 0 msec
>>
>> *Syslog logs:*
>>
>> utCopier.copyOutput(ReduceTask.java:1394)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)
>>
>> 2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201301231151_0002_r_000000_0: Failed fetch #10 from attempt_201301231151_0002_m_000001_0
>> 2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_201301231151_0002_m_000001_0 even after MAX_FETCH_RETRIES_PER_MAP retries... or it is a read error, reporting to the JobTracker
>> 2013-01-23 12:15:44,885 FATAL org.apache.hadoop.mapred.ReduceTask: Shuffle failed with too many fetch failures and insufficient progress!Killing task attempt_201301231151_0002_r_000000_0.
>> 2013-01-23 12:15:44,889 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201301231151_0002_r_000000_0 adding host savitha-VirtualBox to penalty box, next contact in 137 seconds
>> 2013-01-23 12:15:44,889 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201301231151_0002_r_000000_0: Got 1 map-outputs from previous failures
>> 2013-01-23 12:15:45,218 FATAL org.apache.hadoop.mapred.Task: attempt_201301231151_0002_r_000000_0 GetMapEventsThread Ignoring exception :
>> org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, with JvmId: jvm_201301231151_0002_r_1079250852
>>     at org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
>>     at org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:3537)
>>     at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:601)
>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
>>
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at $Proxy1.getMapCompletionEvents(Unknown Source)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2846)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2810)
>>
>> 2013-01-23 12:15:45,220 FATAL org.apache.hadoop.mapred.Task: Failed to contact the tasktracker
>> org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, with JvmId: jvm_201301231151_0002_r_1079250852
>>     at org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
>>     at org.apache.hadoop.mapred.TaskTracker.fatalError(TaskTracker.java:3520)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:601)
>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
>>
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at $Proxy1.fatalError(Unknown Source)
>>     at org.apache.hadoop.mapred.Task.reportFatalError(Task.java:298)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2829)
>>
>> Thanks,
>> Venkat
>>
>> -------- Original Message --------
>> Subject: Re: Configure Hive in Cluster
>> Date: Thu, 17 Jan 2013 17:23:03 +0530
>> From: venkatramanan <venkatraman...@smartek21.com>
>> Reply-To: <user@hive.apache.org>
>> To: <user@hive.apache.org>
>>
>> Can you suggest the mandatory Hive parameters and the cluster
>> configuration steps?
>>
>> On Thursday 17 January 2013 12:56 PM, Nitin Pawar wrote:
>>
>> It looks like a very small cluster with very limited memory for running
>> MapReduce jobs; the number of map/reduce slots on the nodes is also low,
>> so only one map runs at a time.
>>
>> But still, 15 minutes is a lot of time for 600 MB.
>>
>> On Thu, Jan 17, 2013 at 12:47 PM, venkatramanan <venkatraman...@smartek21.com> wrote:
>>
>>> Below are the cluster configuration details:
>>>
>>> Configured Capacity : 82.8 GB
>>> DFS Used : 1.16 GB
>>> Non DFS Used : 31.95 GB
>>> DFS Remaining : 49.69 GB
>>> DFS Used% : 1.4 %
>>> DFS Remaining% : 60.01 %
>>> Live Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=LIVE> : 2
>>> Dead Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=DEAD> : 0
>>> Decommissioning Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=DECOMMISSIONING> : 0
>>> Number of Under-Replicated Blocks : 0
>>>
>>> My select query is:
>>>
>>> "select * from tweet where Id = 810;"
>>>
>>> This query takes 15 minutes to complete.
>>>
>>> On Thursday 17 January 2013 12:29 PM, Nitin Pawar wrote:
>>>
>>> How many nodes do you have for the select query? What is your select query?
>>>
>>> If it is just a "select * from table", it does not run any MapReduce
>>> job, so it is only taking time to show the data on your screen.
>>>
>>> On Thu, Jan 17, 2013 at 12:24 PM, venkatramanan <venkatraman...@smartek21.com> wrote:
>>>
>>>> I didn't set any Hive parameters, and my total table size is only 610 MB.
>>>>
>>>> On Thursday 17 January 2013 12:11 PM, Nitin Pawar wrote:
>>>>
>>>> A bit more detail on the table size and the select query will help.
>>>> Also, did you set any Hive parameters?
>>>>
>>>> On Thu, Jan 17, 2013 at 12:12 PM, venkatramanan <venkatraman...@smartek21.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I am a newbie to Apache Hive. I have created a table that points to an
>>>>> HDFS folder path, and a simple "select *" statement takes 15 minutes
>>>>> to execute. Can anyone suggest best practices and performance
>>>>> improvements for Hive?
>>>>>
>>>>> Thanks in advance,
>>>>> Venkat

--
Nitin Pawar
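The `java.net.UnknownHostException: savitha-VirtualBox` in the thread above means the machine running the Hive CLI cannot resolve the worker's hostname at all, which is why fetching the task log fails. A minimal sketch of that resolution check, using only Python's standard resolver (the hostname comes from the stack trace; the `/etc/hosts` line in the comment is a placeholder suggestion, not a known address):

```python
import socket

def resolvable(host: str) -> bool:
    """True if the OS resolver (DNS or /etc/hosts) maps `host` to an IP."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

# "savitha-VirtualBox" is the worker name from the stack trace. If it does
# not resolve from the client and from every node, a line like the one
# below (placeholder IP) is typically added to /etc/hosts on each machine:
#   192.168.1.20  savitha-VirtualBox
for host in ("localhost", "savitha-VirtualBox"):
    status = "ok" if resolvable(host) else "NOT resolvable"
    print(f"{host}: {status}")
```

On a two-node setup like the one described here, the usual fix is making every node (and the client) resolve every worker's hostname exactly as the TaskTracker reports it.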
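As a back-of-envelope check of the complaint in the older part of the thread (a 610 MB table taking 15 minutes to scan), the implied throughput is well under 1 MB/s, far below a single disk's sequential read rate, which supports the view that something other than data volume is the bottleneck. The numbers below are the ones quoted in the thread:

```python
# Effective scan rate implied by the reported query time: 610 MB in 15 min.
table_mb = 610        # total table size reported in the thread
elapsed_s = 15 * 60   # 15 minutes, in seconds
throughput = table_mb / elapsed_s
print(f"effective scan rate: {throughput:.2f} MB/s")  # prints 0.68 MB/s
```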