No, all the nodes are up and running. I'm not sure, but I guess the error happens when Hive picks up the other node's hostname.

Correct me if I'm wrong.
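(The "UnknownHostException: savitha-VirtualBox" in the trace below points at name resolution rather than a dead node. A minimal check one could run on each machine is sketched here; the hostname comes from the trace, and the example IP is an assumption, not a value from this thread:)

```shell
# Hypothetical check: does each cluster hostname resolve on this machine?
# "savitha-VirtualBox" is taken from the UnknownHostException below.
for h in savitha-VirtualBox; do
  if getent hosts "$h" > /dev/null; then
    echo "$h resolves"
  else
    echo "$h does NOT resolve; map it in /etc/hosts on every node, e.g."
    echo "  192.168.56.101   $h   # assumed IP - use the node's real address"
  fi
done
```

If the machine running the Hive CLI cannot resolve the TaskTracker's hostname, fetching the task log URL fails exactly as in the trace below.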

On Wednesday 23 January 2013 01:07 PM, Nitin Pawar wrote:
When you ran the query, did the VM shut down?


On Wed, Jan 23, 2013 at 12:57 PM, venkatramanan <venkatraman...@smartek21.com> wrote:

    Hi,

    I got the following error while executing "select count(1) from tweettrend;"

    Below are the exact log messages from the JobTracker web interface.

    *Hive CLI error:*

    Exception in thread "Thread-21" java.lang.RuntimeException: Error while reading from task log url
        at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
        at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
        at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
        at java.lang.Thread.run(Thread.java:722)
    Caused by: java.net.UnknownHostException: savitha-VirtualBox
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
        at java.net.Socket.connect(Socket.java:579)
        at java.net.Socket.connect(Socket.java:528)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
        at sun.net.www.http.HttpClient.New(HttpClient.java:290)
        at sun.net.www.http.HttpClient.New(HttpClient.java:306)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1299)
        at java.net.URL.openStream(URL.java:1037)
        at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
        ... 3 more
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
    MapReduce Jobs Launched:
    Job 0: Map: 2  Reduce: 1   Cumulative CPU: 9.0 sec   HDFS Read: 408671053   HDFS Write: 0   FAIL
    Total MapReduce CPU Time Spent: 9 seconds 0 msec

    *_syslog logs_*

    utCopier.copyOutput(ReduceTask.java:1394)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)

    2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201301231151_0002_r_000000_0: Failed fetch #10 from attempt_201301231151_0002_m_000001_0
    2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_201301231151_0002_m_000001_0 even after MAX_FETCH_RETRIES_PER_MAP retries...  or it is a read error,  reporting to the JobTracker
    2013-01-23 12:15:44,885 FATAL org.apache.hadoop.mapred.ReduceTask: Shuffle failed with too many fetch failures and insufficient progress!Killing task attempt_201301231151_0002_r_000000_0.
    2013-01-23 12:15:44,889 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201301231151_0002_r_000000_0 adding host savitha-VirtualBox to penalty box, next contact in 137 seconds
    2013-01-23 12:15:44,889 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201301231151_0002_r_000000_0: Got 1 map-outputs from previous failures
    2013-01-23 12:15:45,218 FATAL org.apache.hadoop.mapred.Task: attempt_201301231151_0002_r_000000_0 GetMapEventsThread Ignoring exception : org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, with JvmId: jvm_201301231151_0002_r_1079250852
        at org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
        at org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:3537)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

        at org.apache.hadoop.ipc.Client.call(Client.java:1070)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy1.getMapCompletionEvents(Unknown Source)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2846)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2810)

    2013-01-23 12:15:45,220 FATAL org.apache.hadoop.mapred.Task: Failed to contact the tasktracker
    org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, with JvmId: jvm_201301231151_0002_r_1079250852
        at org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
        at org.apache.hadoop.mapred.TaskTracker.fatalError(TaskTracker.java:3520)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

        at org.apache.hadoop.ipc.Client.call(Client.java:1070)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy1.fatalError(Unknown Source)
        at org.apache.hadoop.mapred.Task.reportFatalError(Task.java:298)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2829)

    thanks,
    Venkat

    -------- Original Message --------
    Subject:    Re: Configure Hive in Cluster
    Date:       Thu, 17 Jan 2013 17:23:03 +0530
    From:       venkatramanan <venkatraman...@smartek21.com>
    Reply-To:   user@hive.apache.org
    To:         user@hive.apache.org



    Can you suggest the mandatory Hive parameters and the cluster
    configuration steps?

    On Thursday 17 January 2013 12:56 PM, Nitin Pawar wrote:
    Looks like a very small cluster with very limited memory for running
    MapReduce jobs. Also, the number of map/reduce slots on the nodes is
    low, so only one map runs at a time.

    But still, 15 minutes is a lot of time for 600 MB of data.


    On Thu, Jan 17, 2013 at 12:47 PM, venkatramanan <venkatraman...@smartek21.com> wrote:

        Below are the cluster configuration details:

        Configured Capacity               : 82.8 GB
        DFS Used                          : 1.16 GB
        Non DFS Used                      : 31.95 GB
        DFS Remaining                     : 49.69 GB
        DFS Used%                         : 1.4 %
        DFS Remaining%                    : 60.01 %
        Live Nodes                        : 2
        Dead Nodes                        : 0
        Decommissioning Nodes             : 0
        Number of Under-Replicated Blocks : 0
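        (As a quick sanity check, the NameNode numbers above are internally consistent; this is just arithmetic on the values reported in this thread:)

```shell
# Verify: DFS Used + Non DFS Used + DFS Remaining == Configured Capacity,
# and DFS Used% == DFS Used / Configured Capacity. Values copied from above.
awk 'BEGIN {
  used = 1.16; nondfs = 31.95; remaining = 49.69; capacity = 82.8
  printf "sum = %.2f GB (configured capacity = %.1f GB)\n",
         used + nondfs + remaining, capacity
  printf "DFS used%% = %.1f%%\n", 100 * used / capacity
}'
# prints:
# sum = 82.80 GB (configured capacity = 82.8 GB)
# DFS used% = 1.4%
```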

        My Select Query is:

        "select * from tweet where Id = 810;"

        This query takes 15 min to complete



        On Thursday 17 January 2013 12:29 PM, Nitin Pawar wrote:
        How many nodes do you have? What's your select query?

        If it's just a "select * from table", it does not run any
        MapReduce job, so the time is just spent printing the data to
        your screen, if that's the query you are using.
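        (For illustration, the distinction above can be seen with EXPLAIN; these statements are a sketch against the "tweet" table from this thread, assuming Hive 0.x behavior where a bare projection is served by a fetch task:)

```sql
-- Bare projection: answered by a simple fetch task, no MapReduce job.
SELECT * FROM tweet;

-- A filter or an aggregate compiles to a MapReduce job; EXPLAIN shows a
-- "Map Reduce" stage in the plan without running the query.
EXPLAIN SELECT * FROM tweet WHERE Id = 810;
EXPLAIN SELECT count(1) FROM tweet;
```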


        On Thu, Jan 17, 2013 at 12:24 PM, venkatramanan <venkatraman...@smartek21.com> wrote:

            I didn't set any Hive parameters, and my total table size is
            only 610 MB.



            On Thursday 17 January 2013 12:11 PM, Nitin Pawar wrote:
            A bit more detail on the table size and the select query
            would help. Also, did you set any Hive parameters?


            On Thu, Jan 17, 2013 at 12:12 PM, venkatramanan <venkatraman...@smartek21.com> wrote:

                Hi All,

                I'm a newbie to Apache Hive. I have created a table that
                points to an HDFS folder path, and it takes 15 minutes to
                execute a simple "*select*" statement. Can anyone suggest
                best practices and performance improvements for Hive?

                Thanks in advance,

                Venkat




-- Nitin Pawar



