My understanding is that people on this mailing list who are interested in
helping can log comments on the GORA JIRA.
HBase integration with Spark is proven to work, so the intricacies should
be on the Gora side.

On Wed, Aug 26, 2015 at 8:08 AM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> Btw, here is the source code of GoraInputFormat.java:
>
>
> https://github.com/kamaci/gora/blob/master/gora-core/src/main/java/org/apache/gora/mapreduce/GoraInputFormat.java
> On 26 Aug 2015 at 18:05, "Furkan KAMACI" <furkankam...@gmail.com>
> wrote:
>
> I'll send an e-mail to the Gora dev list too, attach my patch to the
>> GSoC Jira issue you mentioned, and then we can continue there.
>>
>> Before I do that, I wanted to get the Spark dev community's ideas on
>> solving my problem, since you may have faced this kind of problem before.
>> On 26 Aug 2015 at 17:13, "Ted Yu" <yuzhih...@gmail.com> wrote:
>>
>>> I found GORA-386, Gora Spark Backend Support.
>>>
>>> Should the discussion be continued there?
>>>
>>> Cheers
>>>
>>> On Wed, Aug 26, 2015 at 7:02 AM, Ted Malaska <ted.mala...@cloudera.com>
>>> wrote:
>>>
>>>> Where is the input format class? Whenever I use the search on your
>>>> GitHub it says "We couldn’t find any issues matching 'GoraInputFormat'".
>>>>
>>>>
>>>>
>>>> On Wed, Aug 26, 2015 at 9:48 AM, Furkan KAMACI <furkankam...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Here is the MapReduceTestUtils.testSparkWordCount()
>>>>>
>>>>>
>>>>> https://github.com/kamaci/gora/blob/master/gora-core/src/test/java/org/apache/gora/mapreduce/MapReduceTestUtils.java#L108
>>>>>
>>>>> Here is SparkWordCount
>>>>>
>>>>>
>>>>> https://github.com/kamaci/gora/blob/8f1acc6d4ef6c192e8fc06287558b7bc7c39b040/gora-core/src/examples/java/org/apache/gora/examples/spark/SparkWordCount.java
>>>>>
>>>>> Lastly, here is GoraSparkEngine:
>>>>>
>>>>>
>>>>> https://github.com/kamaci/gora/blob/master/gora-core/src/main/java/org/apache/gora/spark/GoraSparkEngine.java
>>>>>
>>>>> Kind Regards,
>>>>> Furkan KAMACI
>>>>>
>>>>> On Wed, Aug 26, 2015 at 4:40 PM, Ted Malaska <ted.mala...@cloudera.com
>>>>> > wrote:
>>>>>
>>>>>> Where can I find the code for MapReduceTestUtils.testSparkWordCount?
>>>>>>
>>>>>> On Wed, Aug 26, 2015 at 9:29 AM, Furkan KAMACI <
>>>>>> furkankam...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Here is the test method I've ignored due to the Connection Refused
>>>>>>> failure:
>>>>>>>
>>>>>>>
>>>>>>> https://github.com/kamaci/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/mapreduce/TestHBaseStoreWordCount.java#L65
>>>>>>>
>>>>>>> I've implemented a Spark backend for Apache Gora as a GSoC project and
>>>>>>> this is the last obstacle I have to solve. Any help would be
>>>>>>> appreciated.
>>>>>>>
>>>>>>> Kind Regards,
>>>>>>> Furkan KAMACI
>>>>>>>
>>>>>>> On Wed, Aug 26, 2015 at 3:45 PM, Ted Malaska <
>>>>>>> ted.mala...@cloudera.com> wrote:
>>>>>>>
>>>>>>>> I've always used HBaseTestingUtility and never really had much
>>>>>>>> trouble. I use that for all my unit testing between Spark and HBase.
>>>>>>>>
>>>>>>>> Here are some code examples if you're interested:
>>>>>>>>
>>>>>>>> --Main HBase-Spark Module
>>>>>>>> https://github.com/apache/hbase/tree/master/hbase-spark
>>>>>>>>
>>>>>>>> --Unit test that cover all basic connections
>>>>>>>>
>>>>>>>> https://github.com/apache/hbase/blob/master/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/HBaseContextSuite.scala
>>>>>>>>
>>>>>>>> --If you want to look at the old stuff before it went into HBase
>>>>>>>> https://github.com/cloudera-labs/SparkOnHBase
>>>>>>>>
>>>>>>>> Let me know if that helps
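
A minimal sketch of the HBaseTestingUtility pattern described above, assuming
the HBase 0.98-era API; everything except HBaseTestingUtility itself (the class
name, the structure of the test) is illustrative rather than taken from the
thread:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseTestingUtility;

    public class MiniHBaseSketch {
      public static void main(String[] args) throws Exception {
        HBaseTestingUtility hbaseUtil = new HBaseTestingUtility();
        hbaseUtil.startMiniCluster(1);   // embedded ZooKeeper, HDFS and HBase in one JVM
        try {
          // this Configuration carries the mini cluster's randomly chosen ZooKeeper
          // client port, so it (and not a fresh HBaseConfiguration.create()) is what
          // should be handed to the Spark code under test
          Configuration conf = hbaseUtil.getConfiguration();
          System.out.println("zk port: " + conf.get("hbase.zookeeper.property.clientPort"));
        } finally {
          hbaseUtil.shutdownMiniCluster();
        }
      }
    }
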
>>>>>>>>
>>>>>>>> On Wed, Aug 26, 2015 at 5:40 AM, Ted Yu <yuzhih...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Can you log the contents of the Configuration you pass from Spark?
>>>>>>>>> The output would give you some clue.
>>>>>>>>>
>>>>>>>>> Cheers
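
As a rough sketch of that suggestion (the class and method names here are made
up; it relies only on the fact that Hadoop's Configuration is iterable over its
key/value pairs):

    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;

    public class ConfDump {
      // print every key/value pair so the ZooKeeper quorum and client port that
      // Spark will actually use (hbase.zookeeper.quorum,
      // hbase.zookeeper.property.clientPort) show up in the test output
      static void dump(Configuration conf) {
        for (Map.Entry<String, String> entry : conf) {
          System.out.println(entry.getKey() + " = " + entry.getValue());
        }
      }

      public static void main(String[] args) {
        dump(new Configuration()); // in the test this would be cluster.getConf()
      }
    }
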
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Aug 26, 2015, at 2:30 AM, Furkan KAMACI <furkankam...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Ted,
>>>>>>>>>
>>>>>>>>> I'll check the ZooKeeper connection, but another test method, which
>>>>>>>>> runs on HBase without Spark, works without any error. The HBase
>>>>>>>>> version is 0.98.8-hadoop2 and I use Spark 1.3.1.
>>>>>>>>>
>>>>>>>>> Kind Regards,
>>>>>>>>> Furkan KAMACI
>>>>>>>>> On 26 Aug 2015 at 12:08, "Ted Yu" <yuzhih...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> The connection failure was to ZooKeeper.
>>>>>>>>>>
>>>>>>>>>> Have you verified that localhost:2181 can serve requests?
>>>>>>>>>> What version of HBase was Gora built against?
>>>>>>>>>>
>>>>>>>>>> Cheers
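
One quick, hypothetical way to verify that something is actually listening on
localhost:2181, using nothing more than a plain socket probe (not from the
thread):

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class ZkPortCheck {
      public static void main(String[] args) {
        try (Socket socket = new Socket()) {
          // fails fast with a ConnectException if no ZooKeeper is listening there
          socket.connect(new InetSocketAddress("localhost", 2181), 3000);
          System.out.println("something is listening on localhost:2181");
        } catch (java.io.IOException e) {
          System.out.println("no listener on localhost:2181: " + e);
        }
      }
    }
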
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Aug 26, 2015, at 1:50 AM, Furkan KAMACI <
>>>>>>>>>> furkankam...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I start an HBase cluster for my test class. I use this helper
>>>>>>>>>> class:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://github.com/apache/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/util/HBaseClusterSingleton.java
>>>>>>>>>>
>>>>>>>>>> and I use it like this:
>>>>>>>>>>
>>>>>>>>>> private static final HBaseClusterSingleton cluster =
>>>>>>>>>> HBaseClusterSingleton.build(1);
>>>>>>>>>>
>>>>>>>>>> I retrieve the configuration object as follows:
>>>>>>>>>>
>>>>>>>>>> cluster.getConf()
>>>>>>>>>>
>>>>>>>>>> and I use it in Spark as follows:
>>>>>>>>>>
>>>>>>>>>> sparkContext.newAPIHadoopRDD(conf, MyInputFormat.class, clazzK,
>>>>>>>>>>     clazzV);
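
For reference, here is the same newAPIHadoopRDD pattern spelled out end to end
with a stock Hadoop InputFormat instead of the Gora one; this is a hypothetical,
self-contained sketch (MyInputFormat, clazzK and clazzV above are the mail's
placeholders, and the input path below is made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class NewApiHadoopRddSketch {
      public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setMaster("local[2]").setAppName("newAPIHadoopRDD-sketch"));

        // in the test this would be cluster.getConf(); the Configuration passed
        // here is the one the InputFormat sees when computing splits and reading records
        Configuration conf = new Configuration();
        conf.set("mapreduce.input.fileinputformat.inputdir", "/tmp/input"); // placeholder path

        JavaPairRDD<LongWritable, Text> rdd =
            sc.newAPIHadoopRDD(conf, TextInputFormat.class, LongWritable.class, Text.class);

        System.out.println("partitions: " + rdd.partitions().size());
        sc.stop();
      }
    }
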
>>>>>>>>>>
>>>>>>>>>> When I run my test there is no need to start up a separate HBase
>>>>>>>>>> cluster, because Spark should connect to my dummy cluster. However,
>>>>>>>>>> when I run my test method it throws an error:
>>>>>>>>>>
>>>>>>>>>> 2015-08-26 01:19:59,558 INFO [Executor task launch
>>>>>>>>>> worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn
>>>>>>>>>> (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to
>>>>>>>>>> server localhost/127.0.0.1:2181. Will not attempt to
>>>>>>>>>> authenticate using SASL (unknown error)
>>>>>>>>>>
>>>>>>>>>> 2015-08-26 01:19:59,559 WARN [Executor task launch
>>>>>>>>>> worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn
>>>>>>>>>> (ClientCnxn.java:run(1089)) - Session 0x0 for server null, unexpected
>>>>>>>>>> error, closing socket connection and attempting reconnect
>>>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>>>>>>>   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>>>>>>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>>>>>>
>>>>>>>>>> The HBase tests, which do not run on Spark, work well. When I check
>>>>>>>>>> the logs I see that the cluster and Spark start up correctly:
>>>>>>>>>>
>>>>>>>>>> 2015-08-26 01:35:21,791 INFO [main] hdfs.MiniDFSCluster
>>>>>>>>>> (MiniDFSCluster.java:waitActive(2055)) - Cluster is active
>>>>>>>>>>
>>>>>>>>>> 2015-08-26 01:35:40,334 INFO [main] util.Utils
>>>>>>>>>> (Logging.scala:logInfo(59)) - Successfully started service 
>>>>>>>>>> 'sparkDriver' on
>>>>>>>>>> port 56941.
>>>>>>>>>>
>>>>>>>>>> I realized that when I start up HBase from the command line, my
>>>>>>>>>> test method for Spark connects to it!
>>>>>>>>>>
>>>>>>>>>> So, does that mean it doesn't care about the conf I passed to
>>>>>>>>>> it? Any ideas on how to solve this?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
