RE: Hive on Spark not working

2016-11-29 Thread Joaquin Alzola
Being unable to integrate Hive with Spark separately, I just started the Thrift
server directly on Spark.
Now it is working as expected.
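
For reference, a minimal sketch of that setup (the master URL, port and query below
are examples taken from this thread; adjust them to your cluster):

# Start the Spark Thrift server against the standalone master
$SPARK_HOME/sbin/start-thriftserver.sh \
  --master spark://172.16.173.31:7077 \
  --hiveconf hive.server2.thrift.port=10000
# Then connect with beeline and run the query
beeline -u jdbc:hive2://localhost:10000 -e "select count(*) from employee"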

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: 29 November 2016 11:12
To: user 
Subject: Re: Hive on Spark not working

Hive on Spark engine only works with Spark 1.3.1.


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.



On 29 November 2016 at 07:56, Furcy Pin 
> wrote:
ClassNotFoundException generally means that jars are missing from your class 
path.

You probably need to link the spark jar to $HIVE_HOME/lib
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive
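
For example (a sketch only; the assembly jar name and paths are assumptions based on
a stock Spark 1.6.2 build, not taken from your setup):

# Link the Spark assembly jar into Hive's lib directory
ln -s /mnt/spark/lib/spark-assembly-1.6.2-hadoop2.6.0.jar $HIVE_HOME/lib/
# and point Hive at the Spark installation when switching engines
hive --hiveconf spark.home=/mnt/spark --hiveconf hive.execution.engine=spark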

On Tue, Nov 29, 2016 at 2:03 AM, Joaquin Alzola 
> wrote:
Hi Guys

No matter what I do, when I execute “select count(*) from employee” I get the
following output in the logs:
It is quite funny because if I set hive.execution.engine=mr the output is
correct. If I set hive.execution.engine=spark then I get the errors below.
If I run the query directly through spark-shell it works great.
+---+
|_c0|
+---+
|1005635|
+---+
So there has to be a problem between Hive and Spark.

It seems the RPC(??) connection is not set up … can somebody guide me on what
to look for?
spark.master=spark://172.16.173.31:7077
hive.execution.engine=spark
spark.executor.extraClassPath
/mnt/spark/lib/spark-1.6.2-yarn-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar

Hive 2.0.1 -> Spark 1.6.2 -> Hadoop 2.6.5 -> Scala 2.10

2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher 
(RpcDispatcher.java:handleError(142)) - Received error 
message:io.netty.handler.codec.DecoderException: 
java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
at 
io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at 

Re: Problems with Hive Streaming. Compactions not working. Out of memory errors.

2016-11-29 Thread Eugene Koifman
The OOM is most likely a side effect of not running compactions.
Without compactions you never reduce the number of delta files that
need to be loaded to materialize the data set on read.
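
As an illustration only (the table name "events" is taken from the original post;
the settings shown are the standard compactor properties):

# Check whether any compactions are queued or running
hive -e "SHOW COMPACTIONS"
# Force a major compaction on the streaming table
hive -e "ALTER TABLE events COMPACT 'major'"
# Compactions only run if the metastore has the compactor enabled, e.g. in hive-site.xml:
#   hive.compactor.initiator.on=true
#   hive.compactor.worker.threads=1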

On 11/29/16, 10:03 AM, "Alan Gates"  wrote:

>I'm guessing that this is an issue in the metastore database where it is
>unable to read from the transaction tables due to the ingestion rate.
>What version of Hive are you using?  What database are you storing the
>metadata in?
>
>Alan.
>
>> On Nov 29, 2016, at 00:05, Diego Fustes Villadóniga 
>>wrote:
>> 
>> Hi all,
>>  
>> We are trying to use Hive streaming to ingest data in real time from
>>Flink. We send batches of data every 5 seconds to Hive. We are working with
>>version 1.1.0-cdh5.8.2.
>>  
>> The ingestion works fine. However, compactions are not working, the log
>>shows this error:
>>  
>> Unable to select next element for compaction, ERROR: could not
>>serialize access due to concurrent update
>>  
>> In addition, when we run simple queries like SELECT COUNT(1) FROM
>>events, we are getting OutOfMemory errors, even though we have assigned
>>10GB to each Mapper/Reducer. Seeing the logs, each map task tries to load
>> all delta files, until it breaks, which does not make much sense to me.
>>  
>>  
>> I think that we have followed all the steps described in the
>>documentation, so we are blocked at this point.
>>  
>> Could you help us?
>
>



Re: Problems with Hive Streaming. Compactions not working. Out of memory errors.

2016-11-29 Thread Alan Gates
I’m guessing that this is an issue in the metastore database where it is unable 
to read from the transaction tables due to the ingestion rate.  What version of 
Hive are you using?  What database are you storing the metadata in?

Alan.

> On Nov 29, 2016, at 00:05, Diego Fustes Villadóniga  wrote:
> 
> Hi all,
>  
> We are trying to use Hive streaming to ingest data in real time from Flink. 
> We send batches of data every 5 seconds to Hive. We are working with version 
> 1.1.0-cdh5.8.2.
>  
> The ingestion works fine. However, compactions are not working, the log shows 
> this error:
>  
> Unable to select next element for compaction, ERROR: could not serialize 
> access due to concurrent update
>  
> In addition, when we run simple queries like SELECT COUNT(1) FROM events, we 
> are getting OutOfMemory errors, even though we have assigned 10GB to each 
> Mapper/Reducer. Seeing the logs, each map task tries to load
> all delta files, until it breaks, which does not make much sense to me.
>  
>  
> I think that we have followed all the steps described in the documentation, 
> so we are blocked at this point.
>  
> Could you help us?



Re: Question about partition pruning when there's a type mismatch

2016-11-29 Thread Anthony Hsu
Thanks for the tips, Gopal. I stepped through the code in a debugger and
found that in the case of String = String, the predicate was pushed down to
the SQL query on the metastore side, whereas in the case of String = Int,
the SQL filter pushdown failed, so GenericUDFOPEqual gets evaluated and
returns null, in which case the PartitionPruner treats the value of the
predicate as unknown and returns all partitions.
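
A small illustration of the difference (the table and partition column names here
are hypothetical, not from the original question):

# Pruning works: a string literal compared against the string partition column
hive -e "EXPLAIN SELECT COUNT(*) FROM mytable WHERE datepartition = '2016-11-28-00'"
# Pruning can fail: the unquoted value is evaluated as arithmetic (an integer),
# so the comparison cannot be pushed down to the metastore as a SQL filter
hive -e "EXPLAIN SELECT COUNT(*) FROM mytable WHERE datepartition = 2016-11-28-00"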

On Mon, Nov 28, 2016 at 3:04 PM, Gopal Vijayaraghavan 
wrote:

>
> > I'm wondering why Hive tries to scan all partitions when the quotes are
> omitted. Without the quotes, shouldn't 2016-11-28-00 get evaluated as an
> arithmetic expression, then get cast to a string, and then partitioning
> pruning still occur?
>
> The order of evaluation is different - String = Integer becomes
> UDFToDouble(String) = UDFToDouble(Integer) (because that keeps the >=
> behavior consistent with =).
>
> The version you're running is very relevant here.
>
> Not all versions of hive have a constant folding optimization & even with
> that, only recent versions of hive perform partition pruning when the
> partition column is wrapped in a UDF.
>
> Posting the output of an "explain " would also help.
>
> Cheers,
> Gopal
>
>
>


RE: Need error logging advice on batch processing

2016-11-29 Thread Brotanek, Jan
Usually if a DDL statement fails, the query is displayed one line above the line with the keyword FAILED.

bash> cat CTASchybne2.log | grep -n -B 1 FAILED
4-create table SCHEMA.AAA stored as orc as select * from SCHEMA.AAA
5:FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
Table already exists: SCHEMA.AAA
--
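
A rough sketch of going one step further: for every FAILED line in the log, print the
preceding line (the statement) together with its line number in the log (the log file
name is just the example from above):

awk '/FAILED:/ { print "line " prev_nr ": " prev } { prev = $0; prev_nr = NR }' CTASchybne2.log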

From: neelima g [mailto:nling...@gmail.com]
Sent: Monday, 28 November 2016 18:44
To: user@hive.apache.org
Subject: Re: Need error logging advice on batch processing

Brotanek,

hive -f file.txt --hiveconf hive.cli.errors.ignore=true 2> file.log
and then grep for keywords *Exception*

Have you ever tried capturing which Hive queries failed, i.e. the entire query? I 
am looking for the line number in the file of each query that failed.

Errors are in a format like:
 FAILED: SemanticException [Error 10072]: Database does not exist: x

Please suggest if anyone know solution.

Neelima


On Mon, Nov 28, 2016 at 7:59 AM, Brotanek, Jan 
> wrote:
Hello,

you can log like this:

hive -f file.txt --hiveconf hive.cli.errors.ignore=true 2> file.log
and then grep for keywords *Exception*


From: neelima g [mailto:nling...@gmail.com]
Sent: Monday, 28 November 2016 16:15
To: user@hive.apache.org
Subject: Need error logging advice on batch processing

Hi,

I want to run a batch-mode task like the one below, and I am using --hiveconf 
hive.cli.errors.ignore=true:

hive -f file.txt --hiveconf hive.cli.errors.ignore=true

My requirement is to find which queries failed after the run. Is that possible? 
The file contains CREATE statements, i.e. DDL.




RE: Hive on Spark not working

2016-11-29 Thread Joaquin Alzola
Hi Mich

I read in an older post that you made it work as well with the configuration I 
have:
Hive 2.0.1 -> Spark 1.6.2 -> Hadoop 2.6.5 -> Scala 2.10
Did you only make it work with Hive 1.2.1 -> Spark 1.3.1 -> etc.?

BR

Joaquin

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: 29 November 2016 11:12
To: user 
Subject: Re: Hive on Spark not working

Hive on Spark engine only works with Spark 1.3.1.


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.



On 29 November 2016 at 07:56, Furcy Pin 
> wrote:
ClassNotFoundException generally means that jars are missing from your class 
path.

You probably need to link the spark jar to $HIVE_HOME/lib
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive

On Tue, Nov 29, 2016 at 2:03 AM, Joaquin Alzola 
> wrote:
Hi Guys

No matter what I do, when I execute “select count(*) from employee” I get the
following output in the logs:
It is quite funny because if I set hive.execution.engine=mr the output is
correct. If I set hive.execution.engine=spark then I get the errors below.
If I run the query directly through spark-shell it works great.
+---+
|_c0|
+---+
|1005635|
+---+
So there has to be a problem between Hive and Spark.

It seems the RPC(??) connection is not set up … can somebody guide me on what
to look for?
spark.master=spark://172.16.173.31:7077
hive.execution.engine=spark
spark.executor.extraClassPath
/mnt/spark/lib/spark-1.6.2-yarn-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar

Hive 2.0.1 -> Spark 1.6.2 -> Hadoop 2.6.5 -> Scala 2.10

2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher 
(RpcDispatcher.java:handleError(142)) - Received error 
message:io.netty.handler.codec.DecoderException: 
java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
at 
io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at 

Re: Issues regarding HPLSQL tool

2016-11-29 Thread Dmitry Tolpeko
Ainhoa,

Did you provide the correct username and password to connect to the Hive server?
The problem I see is that it cannot launch a MapReduce job. SELECT * FROM
tab does not require an MR job, so it works fine, while when you add a WHERE
clause it fails.
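
One way to narrow it down (a sketch only; the port and user below are placeholders,
adjust to your HiveServer2 setup): run the same statement outside HPL/SQL with the
same credentials and see whether the MapReduce job launches there.

beeline -u jdbc:hive2://localhost:10000 -n training -e "SELECT * FROM movies WHERE movieid < 5"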

Thanks,
Dmitry

On Tue, Nov 29, 2016 at 2:33 PM, Ainhoa Benitez 
wrote:

> This is the error log:
>
>
> [training@miguel ~]$ hplsql -e "SELECT * FROM movies where movieid <5";
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/cloudera/
> parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/slf4j-log4j12-1.7.
> 5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/cloudera/
> parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/pig-0.12.0-cdh5.9.
> 0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/cloudera/
> parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/slf4j-simple-1.7.5.
> jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/cloudera/
> parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/avro-tools-1.7.6-
> cdh5.9.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 16/11/29 10:49:26 INFO jdbc.Utils: Supplied authorities: localhost:1
> 16/11/29 10:49:26 INFO jdbc.Utils: Resolved authority: localhost:1
> Open connection: jdbc:hive2://localhost:1 (690 ms)
> Starting query
> Unhandled exception in HPL/SQL
> java.sql.SQLException: Error while processing statement: FAILED: Execution
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at org.apache.hive.jdbc.HiveStatement.execute(
> HiveStatement.java:279)
> at org.apache.hive.jdbc.HiveStatement.executeQuery(
> HiveStatement.java:375)
> at org.apache.hive.hplsql.Conn.executeQuery(Conn.java:63)
> at org.apache.hive.hplsql.Exec.executeQuery(Exec.java:554)
> at org.apache.hive.hplsql.Exec.executeQuery(Exec.java:563)
> at org.apache.hive.hplsql.Select.select(Select.java:74)
> at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:993)
> at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:51)
> at org.apache.hive.hplsql.HplsqlParser$Select_stmtContext.accept(
> HplsqlParser.java:14249)
> at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.
> visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:985)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:51)
> at org.apache.hive.hplsql.HplsqlParser$StmtContext.
> accept(HplsqlParser.java:998)
> at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.
> visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(
> HplsqlBaseVisitor.java:28)
> at org.apache.hive.hplsql.HplsqlParser$BlockContext.
> accept(HplsqlParser.java:438)
> at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.
> visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:893)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:51)
> at org.apache.hive.hplsql.HplsqlParser$ProgramContext.
> accept(HplsqlParser.java:381)
> at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(
> AbstractParseTreeVisitor.java:42)
> at org.apache.hive.hplsql.Exec.run(Exec.java:753)
> at org.apache.hive.hplsql.Exec.run(Exec.java:729)
> at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
> [training@miguel ~]$
> [training@miguel ~]$ ssh bea
> The authenticity of host 'bea (10.164.79.119)' can't be established.
> RSA key fingerprint is f8:ce:3a:a0:92:23:3d:e7:3f:e1:42:50:4b:17:de:7d.
> Are you sure you want to continue connecting (yes/no)? yes
> Warning: Permanently added 'bea,10.164.79.119' (RSA) to the list of known
> hosts.
> Last login: Mon Nov 28 17:50:38 2016 from 10.164.77.156
>
>
> 2016-11-29 12:31 GMT+01:00 Dmitry Tolpeko :
>
>> Please post as text message to user@ list.
>>
>> On Tue, Nov 29, 2016 at 2:02 PM, Ainhoa Benitez > > wrote:
>>
>>> Hello Dmitry,
>>>
>>> I attach a screenshot with the error (On the top is the query I executed)
>>>
>>> Thanks
>>>
>>> Ainhoa
>>>
>>>
>>> 2016-11-29 11:30 GMT+01:00 Dmitry Tolpeko :
>>>
 Ainhoa,

 Can you please post the entire script? Also try to add --trace option
 to see which query was actually executed in the database.

 Thanks,
 Dmitry

 On Tue, Nov 29, 2016 at 11:44 AM, Ainhoa Benitez <
 abeni...@corenetworks.es> wrote:

> Good morning,
>
> my name is Ainhoa. I am starting to use the HPLSQL tool and up to now
> it was working fine. However, my issue has to do when using a simple where

Re: Hive on Spark not working

2016-11-29 Thread Mich Talebzadeh
Hive on Spark engine only works with Spark 1.3.1.

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 29 November 2016 at 07:56, Furcy Pin  wrote:

> ClassNotFoundException generally means that jars are missing from your
> class path.
>
> You probably need to link the spark jar to $HIVE_HOME/lib
> https://cwiki.apache.org/confluence/display/Hive/Hive+
> on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive
>
> On Tue, Nov 29, 2016 at 2:03 AM, Joaquin Alzola  > wrote:
>
>> Hi Guys
>>
>>
>>
>> No matter what I do, when I execute “select count(*) from employee” I
>> get the following output in the logs:
>>
>> It is quite funny because if I set hive.execution.engine=mr the output is
>> correct. If I set hive.execution.engine=spark then I get the errors below.
>>
>> If I run the query directly through spark-shell it works great.
>>
>> +---+
>>
>> |_c0|
>>
>> +---+
>>
>> |1005635|
>>
>> +---+
>>
>> So there has to be a problem between Hive and Spark.
>>
>>
>>
>> It seems the RPC(??) connection is not set up … can somebody guide me on
>> what to look for?
>>
>> spark.master=spark://172.16.173.31:7077
>>
>> hive.execution.engine=spark
>>
>> spark.executor.extraClassPath/mnt/spark/lib/spark-1.6.2-yar
>> n-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar
>>
>>
>>
>> Hive 2.0.1 -> Spark 1.6.2 -> Hadoop 2.6.5 -> Scala 2.10
>>
>>
>>
>> 2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher
>> (RpcDispatcher.java:handleError(142)) - Received error
>> message:io.netty.handler.codec.DecoderException:
>> java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
>>
>> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteT
>> oMessageDecoder.java:358)
>>
>> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(Byte
>> ToMessageDecoder.java:230)
>>
>> at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteTo
>> MessageCodec.java:103)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannel
>> Read(AbstractChannelHandlerContext.java:308)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRe
>> ad(AbstractChannelHandlerContext.java:294)
>>
>> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(Ch
>> annelInboundHandlerAdapter.java:86)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannel
>> Read(AbstractChannelHandlerContext.java:308)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRe
>> ad(AbstractChannelHandlerContext.java:294)
>>
>> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(Defa
>> ultChannelPipeline.java:846)
>>
>> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.
>> read(AbstractNioByteChannel.java:131)
>>
>> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEven
>> tLoop.java:511)
>>
>> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimiz
>> ed(NioEventLoop.java:468)
>>
>> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEve
>> ntLoop.java:382)
>>
>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>
>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(
>> SingleThreadEventExecutor.java:111)
>>
>> at java.lang.Thread.run(Thread.java:745)
>>
>> Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/J
>> ob
>>
>> at java.lang.ClassLoader.defineClass1(Native Method)
>>
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>>
>> at java.security.SecureClassLoader.defineClass(SecureClassLoade
>> r.java:142)
>>
>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>
>> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>>
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>>
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>>
>> at java.security.AccessController.doPrivileged(Native Method)
>>
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>>
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>>
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>
>> at 

Re: Issues regarding HPLSQL tool

2016-11-29 Thread Dmitry Tolpeko
Ainhoa,

Can you please post the entire script? Also try to add --trace option to
see which query was actually executed in the database.

Thanks,
Dmitry

On Tue, Nov 29, 2016 at 11:44 AM, Ainhoa Benitez 
wrote:

> Good morning,
>
> my name is Ainhoa. I am starting to use the HPLSQL tool and up to now it
> was working fine. However, my issue appears when using a simple WHERE
> clause: does this tool accept a WHERE clause?
>
> For example, the problem is the following:
>
> -I have a table named "test" with several fields. I make a select clause
> such as "select * from test limit 3" and it displays the result with no
> problem. I try another query selecting particular fields of the
> table, and again no problem.
>
> Nevertheless, the issue comes when I try to do some of the following
> queries:
>
> -Select count(*) from test;
>
> -Select * from test where movieid <5;
>
> I do not have any idea why the problem keeps occurring or how to solve it.
> Is there any possibility to make a query with WHERE conditions? If so, how?
> And as for the count query, what could the problem be?
>
> Thanks so much for your help!!!
>
> Ainhoa
>
>


Reg:Sqoop Import-Oracle to Hive-Parquet

2016-11-29 Thread kishore kumar
Hi Experts,

We are trying to use Parquet for importing data from Oracle to Hive, but we are
encountering the error below. Could anyone help me resolve this issue?

We are using sqoop version 1.4.6 and hive version 1.2.

Error:



16/11/28 21:21:46 INFO hive.metastore: Connected to metastore.

16/11/28 21:21:46 ERROR tool.ImportTool: Imported Failed: Cannot convert
unsupported type: timestamp
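
A commonly used workaround (a sketch only; the connect string, table and column
names below are placeholders, not from your job) is to map the offending Oracle
DATE/TIMESTAMP column(s) to String explicitly:

# Map the timestamp column to a Java String so the Parquet import can proceed
sqoop import --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username scott -P \
  --table MYSCHEMA.MYTABLE \
  --hive-import --as-parquetfile \
  --map-column-java CREATED_TS=String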


Thanks,

Kishore Kumar.

-- 
Thanks,
Kishore.


Problems with Hive Streaming. Compactions not working. Out of memory errors.

2016-11-29 Thread Diego Fustes Villadóniga
Hi all,

We are trying to use Hive streaming to ingest data in real time from Flink. We 
send batches of data every 5 seconds to Hive. We are working with version 
1.1.0-cdh5.8.2.

The ingestion works fine. However, compactions are not working; the log shows 
this error:

Unable to select next element for compaction, ERROR: could not serialize access 
due to concurrent update

In addition, when we run simple queries like SELECT COUNT(1) FROM events, we 
are getting OutOfMemory errors, even though we have assigned 10GB to each 
Mapper/Reducer. Looking at the logs, each map task tries to load
all delta files until it breaks, which does not make much sense to me.


I think that we have followed all the steps described in the documentation, so 
we are blocked at this point.

Could you help us?