I'll send an e-mail to the Gora dev list too, attach my patch to the GSoC Jira issue you mentioned, and then we can continue the discussion there.
Before I do that, I wanted to get the Spark dev community's ideas on my problem, since you may have faced this kind of problem before.

On Aug 26, 2015 at 17:13, "Ted Yu" <yuzhih...@gmail.com> wrote:

> I found GORA-386, Gora Spark Backend Support.
>
> Should the discussion be continued there?
>
> Cheers
>
> On Wed, Aug 26, 2015 at 7:02 AM, Ted Malaska <ted.mala...@cloudera.com> wrote:
>
>> Where is the input format class? Whenever I use the search on your
>> github it says "We couldn’t find any issues matching 'GoraInputFormat'".
>>
>> On Wed, Aug 26, 2015 at 9:48 AM, Furkan KAMACI <furkankam...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Here is MapReduceTestUtils.testSparkWordCount():
>>>
>>> https://github.com/kamaci/gora/blob/master/gora-core/src/test/java/org/apache/gora/mapreduce/MapReduceTestUtils.java#L108
>>>
>>> Here is SparkWordCount:
>>>
>>> https://github.com/kamaci/gora/blob/8f1acc6d4ef6c192e8fc06287558b7bc7c39b040/gora-core/src/examples/java/org/apache/gora/examples/spark/SparkWordCount.java
>>>
>>> Lastly, here is GoraSparkEngine:
>>>
>>> https://github.com/kamaci/gora/blob/master/gora-core/src/main/java/org/apache/gora/spark/GoraSparkEngine.java
>>>
>>> Kind Regards,
>>> Furkan KAMACI
>>>
>>> On Wed, Aug 26, 2015 at 4:40 PM, Ted Malaska <ted.mala...@cloudera.com> wrote:
>>>
>>>> Where can I find the code for MapReduceTestUtils.testSparkWordCount?
>>>>
>>>> On Wed, Aug 26, 2015 at 9:29 AM, Furkan KAMACI <furkankam...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Here is the test method I've ignored, because it fails with a Connection Refused error:
>>>>>
>>>>> https://github.com/kamaci/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/mapreduce/TestHBaseStoreWordCount.java#L65
>>>>>
>>>>> I've implemented a Spark backend for Apache Gora as a GSoC project, and
>>>>> this is the last obstacle I need to solve. Any help is welcome.
>>>>>
>>>>> Kind Regards,
>>>>> Furkan KAMACI
>>>>>
>>>>> On Wed, Aug 26, 2015 at 3:45 PM, Ted Malaska <ted.mala...@cloudera.com> wrote:
>>>>>
>>>>>> I've always used HBaseTestingUtility and never really had much
>>>>>> trouble. I use it for all my unit testing between Spark and HBase.
>>>>>>
>>>>>> Here are some code examples if you're interested:
>>>>>>
>>>>>> -- Main HBase-Spark module:
>>>>>> https://github.com/apache/hbase/tree/master/hbase-spark
>>>>>>
>>>>>> -- Unit tests that cover all basic connections:
>>>>>> https://github.com/apache/hbase/blob/master/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/HBaseContextSuite.scala
>>>>>>
>>>>>> -- If you want to look at the old stuff before it went into HBase:
>>>>>> https://github.com/cloudera-labs/SparkOnHBase
>>>>>>
>>>>>> Let me know if that helps.
>>>>>>
>>>>>> On Wed, Aug 26, 2015 at 5:40 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>
>>>>>>> Can you log the contents of the Configuration you pass from Spark?
>>>>>>> The output would give you some clue.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Aug 26, 2015, at 2:30 AM, Furkan KAMACI <furkankam...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Ted,
>>>>>>>
>>>>>>> I'll check the Zookeeper connection, but another test method which runs
>>>>>>> on HBase without Spark works without any error. The HBase version is
>>>>>>> 0.98.8-hadoop2 and I use Spark 1.3.1.
>>>>>>>
>>>>>>> Kind Regards,
>>>>>>> Furkan KAMACI
>>>>>>>
>>>>>>> On Aug 26, 2015 at 12:08, "Ted Yu" <yuzhih...@gmail.com> wrote:
>>>>>>>
>>>>>>>> The connection failure was to zookeeper.
>>>>>>>>
>>>>>>>> Have you verified that localhost:2181 can serve requests?
>>>>>>>> What version of hbase was Gora built against?
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>> On Aug 26, 2015, at 1:50 AM, Furkan KAMACI <furkankam...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I start an HBase cluster for my test class.
>>>>>>>> I use this helper class:
>>>>>>>>
>>>>>>>> https://github.com/apache/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/util/HBaseClusterSingleton.java
>>>>>>>>
>>>>>>>> and use it like this:
>>>>>>>>
>>>>>>>> private static final HBaseClusterSingleton cluster = HBaseClusterSingleton.build(1);
>>>>>>>>
>>>>>>>> I retrieve the configuration object as follows:
>>>>>>>>
>>>>>>>> cluster.getConf()
>>>>>>>>
>>>>>>>> and I use it in Spark as follows:
>>>>>>>>
>>>>>>>> sparkContext.newAPIHadoopRDD(conf, MyInputFormat.class, clazzK, clazzV);
>>>>>>>>
>>>>>>>> When I run my test there is no need to start up an HBase cluster,
>>>>>>>> because Spark should connect to my dummy cluster. However, when I run my
>>>>>>>> test method it throws an error:
>>>>>>>>
>>>>>>>> 2015-08-26 01:19:59,558 INFO [Executor task launch worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
>>>>>>>> 2015-08-26 01:19:59,559 WARN [Executor task launch worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>>>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>>>>
>>>>>>>> The HBase tests which do not run on Spark work fine.
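As an aside on Ted Yu's question about whether localhost:2181 can serve requests: this can be checked without any HBase or ZooKeeper client dependency by sending ZooKeeper's "ruok" four-letter-word command over a plain socket (a healthy server replies "imok"). Below is a minimal pure-JDK sketch; the class and method names are made up for illustration:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ZkProbe {
    /** Returns true if a ZooKeeper server answers "imok" to the "ruok" command. */
    public static boolean canServeRequests(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            socket.shutdownOutput();
            // Read the 4-byte reply, tolerating short reads.
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[4];
            int read = 0, n;
            while (read < buf.length && (n = in.read(buf, read, buf.length - read)) != -1) {
                read += n;
            }
            return read == 4 && "imok".equals(new String(buf, StandardCharsets.US_ASCII));
        } catch (Exception e) {
            // Connection refused, timeout, etc.: nothing usable is listening there.
            return false;
        }
    }
}
```

Note that on ZooKeeper 3.5.3 and later the "ruok" command must be enabled via `4lw.commands.whitelist`; on the 3.4.x line used with HBase 0.98 it is always available.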
>>>>>>>> When I check the logs I see that the cluster and Spark are started up correctly:
>>>>>>>>
>>>>>>>> 2015-08-26 01:35:21,791 INFO [main] hdfs.MiniDFSCluster (MiniDFSCluster.java:waitActive(2055)) - Cluster is active
>>>>>>>> 2015-08-26 01:35:40,334 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 56941.
>>>>>>>>
>>>>>>>> I realized that when I start up an HBase from the command line, my test
>>>>>>>> method for Spark connects to it!
>>>>>>>>
>>>>>>>> So, does that mean it doesn't care about the conf I passed to it? Any
>>>>>>>> ideas about how to solve it?
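Spark does ship the Configuration given to newAPIHadoopRDD out to the executors, so a useful first step, in line with Ted Yu's suggestion, is to log the HBase/ZooKeeper keys from the conf actually being passed. One plausible cause of this exact symptom (an assumption here, not confirmed in the thread) is that the mini cluster's ZooKeeper listens on a randomly assigned port, while some code path builds a fresh HBaseConfiguration that falls back to the default localhost:2181, which is also why a manually started HBase on the default port gets picked up. Since Hadoop's Configuration implements Iterable<Map.Entry<String, String>>, a generic helper can extract the relevant keys; the helper name entriesWithPrefix is made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ConfDump {
    /**
     * Returns "key=value" strings for entries whose key starts with one of the
     * given prefixes. Hadoop's Configuration is an
     * Iterable<Map.Entry<String, String>>, so the mini-cluster conf can be
     * passed here directly, e.g.
     * entriesWithPrefix(cluster.getConf(), "hbase.zookeeper").
     */
    public static List<String> entriesWithPrefix(Iterable<Map.Entry<String, String>> conf,
                                                 String... prefixes) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : conf) {
            for (String prefix : prefixes) {
                if (entry.getKey().startsWith(prefix)) {
                    hits.add(entry.getKey() + "=" + entry.getValue());
                    break;
                }
            }
        }
        return hits;
    }
}
```

If `hbase.zookeeper.quorum` or `hbase.zookeeper.property.clientPort` come back with default values rather than the mini cluster's actual port, then the conf reaching newAPIHadoopRDD is not the cluster's conf, and the fix is to make sure cluster.getConf() (or at least those keys) is what gets passed in.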