Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Josh Mahonin
Hi Babar, Can you file a JIRA for this? I suspect this is something to do with the Spark 1.5 data frame API data structures, perhaps they've gone and changed them again! Can you try with previous Spark versions to see if there's a difference? Also, you may have luck interfacing with the RDDs

Re: BulkloadTool issue even after successful HfileLoads

2015-09-23 Thread Gabriel Reid
Hi Dhruv, This is a bug in Phoenix, although it appears that your Hadoop configuration is also somewhat unusual. As far as I can see, your Hadoop configuration is set up to use the local filesystem, and not HDFS. You can test this by running the following command: hadoop dfs -ls / If that
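Gabriel's filesystem check can be sketched as below. This is an operational sketch only; it assumes a Hadoop installation is on the PATH, and the exact output depends on your cluster.

```shell
# Ask Hadoop which default filesystem it is configured to use.
# A local-filesystem setup prints file:///, an HDFS setup prints
# something like hdfs://<namenode>:<port>.
hdfs getconf -confKey fs.defaultFS

# List the filesystem root, as Gabriel suggests ("hadoop fs" is the
# non-deprecated spelling of "hadoop dfs" on recent Hadoop versions).
hadoop fs -ls /
```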

BulkloadTool issue even after successful HfileLoads

2015-09-23 Thread Dhruv Gohil
Hi, I am able to successfully use BulkLoadTool to load millions of rows into a Phoenix table, but at the end of each execution the following error occurs. I need your inputs to make the runs fully green. The following is a minimal reproduction using the EXAMPLE given in the documentation. Env: CDH 5.3 Hbase
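For readers reproducing this, a typical CsvBulkLoadTool invocation looks roughly like the following. This is a sketch only: the jar name, table name, and paths are placeholders, and the input/output paths are expected to live on HDFS.

```shell
# Hypothetical jar version, table, and paths; adjust for your cluster.
HADOOP_CLASSPATH=$(hbase classpath) hadoop jar phoenix-4.x-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE \
    --input /data/example.csv \
    --output /tmp/example-hfiles
```

The `--output` directory holds the intermediate HFiles that HBase then bulk-loads, which is why it must be reachable by the region servers (i.e. on HDFS).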

Re: BulkloadTool issue even after successful HfileLoads

2015-09-23 Thread Dhruv Gohil
Thanks Gabriel for the quick response. You are right about my Hadoop config; it's somehow using the local filesystem. (Let me find out how to change this in CDH.) For the tool, "--output" is an optional argument, so it should work without that argument. But in that case we are not at all able to

Re: BulkloadTool issue even after successful HfileLoads

2015-09-23 Thread Gabriel Reid
That's correct; without the output written to HDFS, HBase won't be able to load the HFiles unless HBase is also hosted on the local filesystem. In any case, properly configuring Hadoop to use the same HDFS setup everywhere will be the easiest way to get things working. - Gabriel On Wed,
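A common cause of the behaviour Gabriel describes is `fs.defaultFS` (`fs.default.name` on older Hadoop versions) pointing at `file:///`. A sketch of the relevant core-site.xml entry, with a hypothetical NameNode host:

```xml
<!-- core-site.xml: point all Hadoop clients at HDFS rather than file:/// -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```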

Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Babar Tareen
I have filed PHOENIX-2287 for this. And the code works fine with Spark 1.4.1. Thanks On Wed, Sep 23, 2015 at 6:06 AM Josh Mahonin wrote: > Hi Babar, > > Can you file a JIRA for this? I suspect this is something to do

Re: Setting a TTL in an upsert

2015-09-23 Thread James Taylor
Hi Alex, I can think of a couple of ways to support this: 1) Surface support for per-cell TTLs (HBASE-10560) in Phoenix (PHOENIX-1335). This could have the kind of syntax you mentioned (or alternatively rely on a connection property, so no syntactic change would be necessary, and then in

Re: Setting a TTL in an upsert

2015-09-23 Thread James Taylor
Also, for more information on (2), see https://phoenix.apache.org/faq.html#Can_phoenix_work_on_tables_with_arbitrary_timestamp_as_flexible_as_HBase_API On Wed, Sep 23, 2015 at 10:55 AM, James Taylor wrote: > Hi Alex, > I can think of a couple of ways to support this: >
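The FAQ entry James links describes Phoenix's `CurrentSCN` connection property, which pins the HBase cell timestamp used by statements on that connection. A minimal sketch follows; the JDBC URL and the commented-out connection lines are hypothetical, since connecting requires a live cluster.

```java
import java.util.Properties;

public class CurrentScnSketch {
    // Build connection properties that pin the HBase cell timestamp
    // for all upserts made through the resulting Phoenix connection.
    static Properties withCurrentScn(long timestampMillis) {
        Properties props = new Properties();
        props.setProperty("CurrentSCN", Long.toString(timestampMillis));
        return props;
    }

    public static void main(String[] args) {
        // Timestamp value taken from Alex's example in this thread.
        Properties props = withCurrentScn(1442988643355L);
        System.out.println(props.getProperty("CurrentSCN"));

        // With a live cluster (hypothetical URL), you would then do:
        // Connection conn =
        //     DriverManager.getConnection("jdbc:phoenix:localhost", props);
        // conn.createStatement().executeUpdate("UPSERT INTO test VALUES(1,2,3)");
    }
}
```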

Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Josh Mahonin
I've got a patch attached to the ticket that I think should fix your issue. If you're able to try it out and let us know how it goes, it'd be much appreciated. From: Babar Tareen Reply-To: "user@phoenix.apache.org" Date: Wednesday, September 23, 2015 at 1:14 PM

RE: Setting a TTL in an upsert

2015-09-23 Thread Alex Loffler
Hi James, Thank you for the info/validation. What I had in mind was the ability to define the (HBase per-cell) timestamp arbitrarily for each upsert statement, but more broadly there seem to be at least three levels of granularity: 1) Per cell/column timestamp – where each column

Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Babar Tareen
I tried the patch; it resolves this issue. Thanks. On Wed, Sep 23, 2015 at 11:23 AM Josh Mahonin wrote: > I've got a patch attached to the ticket that I think should fix your > issue. > > If you're able to try it out and let us know how it goes, it'd be much >

Setting a TTL in an upsert

2015-09-23 Thread Alex Loffler
Hi, Is it possible to define the TTL of a row (or even of each cell in the row) during an upsert, e.g.: upsert into test values(1,2,3) TTL=1442988643355; Assuming the table has a TTL, this would allow per-row retention policies (with automatic garbage collection by HBase) by e.g. setting the

Re: Setting a TTL in an upsert

2015-09-23 Thread James Taylor
Thanks, Alex. I agree - starting with (3) would be best, as we wouldn't need any non-standard SQL syntax. On Wed, Sep 23, 2015 at 12:06 PM, Alex Loffler wrote: > Hi James, > > > > Thank you for the info/validation. What I had in mind was the ability to > define the (HBase