I have another question. As a side-solution (as in, something that needs to be done soon and can be quick and dirty), would it be possible to pipe the Nutch output somewhere and then use a cron job (or some other timed process) to import it into Hive? Has anyone done this?

On Wed, May 1, 2013 at 11:16 PM, Renato Marroquín Mogrovejo <[email protected]> wrote:

> Hi Yves,
>
> Just get your head around hadoop and start playing around with it.
> Nutch is a great place to start getting familiarized with Gora.
> We will help you any time you need it and then you can help us push
> Gora forward (:
>
>
> Renato M.
>
> 2013/5/1 Yves S. Garret <[email protected]>:
> > Hi Renato,
> >
> > Sounds kinda fun :) . I'll start reading here about Gora in
> > order to better understand. I'm also reading Hadoop: The
> > Definitive Guide.
> >
> > http://gora.apache.org/
> >
> > Should I look into anything else in order to learn more about
> > Gora? I'll have to admit, I'm new to Hadoop and don't have
> > the firmest grasp of the internals. I'm not sure how useful
> > I'll be at this moment.
> >
> >
> > On Tue, Apr 30, 2013 at 6:21 PM, Renato Marroquín Mogrovejo
> > <[email protected]> wrote:
> >
> >> Hi Yves,
> >>
> >> Apache Gora does not support Apache Hive just yet, but we have it on
> >> our future plans. If you were willing to dive into an adventure with
> >> Gora we would be happy to help you out with that.
> >> There is a Pig-Gora adapter patch on JIRA, maybe you would like to
> >> give it a look? Although there is a bit of work involved in that one
> >> as well.
> >>
> >>
> >> Renato M.
> >>
> >> 2013/4/30 Yves S. Garret <[email protected]>:
> >> > Ok. I think I understand. So there's an adapter involved here.
> >> >
> >> >
> >> > On Tue, Apr 30, 2013 at 5:04 PM, Tejas Patil <[email protected]> wrote:
> >> >
> >> >> Nutch 2.x series is built on Gora which offers storage abstraction. From
> >> >> the Gora project main page, I think gora has adapters for accessing the
> >> >> data and making analysis through Apache Hive but it wont support storing
> >> >> of nutch data into hive.
> >> >>
> >> >> There are Gora experts on this group who can answer better.
> >> >>
> >> >>
> >> >> On Tue, Apr 30, 2013 at 1:46 PM, Yves S. Garret
> >> >> <[email protected]> wrote:
> >> >>
> >> >> > Hello,
> >> >> >
> >> >> > I'm curious. If I wanted to store the URLs for Nutch (version 2.1) in
> >> >> > Hive (version 0.9.0) and then store the output from Nutch in Hive, how
> >> >> > would I do that? Any pointers?
> >> >> >
> >> >> > I've googled for "nutch hive" (maybe there's a better term?), but
> >> >> > haven't found anything specific or very helpful. I'll keep looking
> >> >> > and experimenting. Your help is appreciated.
> >> >> >
> >> >> > --Yves

