Re: [DISCUSS] What will happen to Nutch Gora aka Nutchbase (was Re: [VOTE] Move 2.0 out of trunk)

2011-09-19 Thread Mattmann, Chris A (388J)
Note to all: please use the [DISCUSS] thread format to discuss the VOTE, and 
please don't reply all to the VOTE thread and sully up the VOTE tallies with 
discussion.

Radim,

Thanks for your email. What you propose has been suggested as an option. 
The best way to help see it happen sooner rather than later is to get involved 
and/or contribute towards discussion, design, code, etc., for the issues that 
you are interested in. We welcome any contributions in this area. The nutchgora 
branch will still be there, and if there's a desire to have a nutchcassandra or 
nutchhbase pure branch, and you have some spare cycles to help see it come about, 
we would welcome it.

Cheers,
Chris

On Sep 19, 2011, at 7:30 AM, Radim Kolar wrote:

 I'm glad to hear that there are at least 2 people in the community that 
 do business in their field and proudly use a Nutch-based crawler 
 together with Cassandra to store the data through Gora. That would not 
 have been possible with the Nutch 1.x version.
 What about dropping Gora, because it is progressing too slowly, and making 
 Nutch 2.x cassandra/hadoop db based only?


++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++



Re: [DISCUSS] What will happen to Nutch Gora aka Nutchbase (was Re: [VOTE] Move 2.0 out of trunk)

2011-09-19 Thread Mattmann, Chris A (388J)
Hi Radim,

On Sep 19, 2011, at 9:22 AM, Radim Kolar wrote:

 The nutchgora branch will still be there, and if there's a desire to 
 have a nutchcassandra or nutchhbase pure branch, and you have some spare 
 cycles to help see it come about, we would welcome it.
 
 
 It needs to be done in a more long-term, strategic way:
 
 1. research what people expect from Nutch 2
 2. find out which Gora backends they use or want to use
 3. decide whether to drop Gora or not

Sure. In fact, there have been several ongoing conversations related to this 
for over a year now.

See these threads:

http://s.apache.org/HhP
http://s.apache.org/zJX
http://s.apache.org/4tC
http://s.apache.org/BkM
http://s.apache.org/ka
http://s.apache.org/Rbi
http://s.apache.org/XZe
http://s.apache.org/X8F
http://s.apache.org/bKr
http://s.apache.org/gu
http://s.apache.org/gN9
http://s.apache.org/OCZ
http://s.apache.org/QID
http://s.apache.org/xk
http://s.apache.org/gw
http://s.apache.org/p6w

Feel free to contribute to the discussion.

Cheers,
Chris




[DISCUSS] What will happen to Nutch Gora aka Nutchbase (was Re: [VOTE] Move 2.0 out of trunk)

2011-09-18 Thread Mattmann, Chris A (388J)
Hi Radim,

Thanks for your feedback. Just to dispel the thought that this VOTE will 
remove the Nutch-with-Gora version from SVN: it won't (not that it could 
ever fully remove it anyway; since SVN is a version control system, 
Nutch-with-Gora will always be around in some form or fashion).

Simply, we are VOTE'ing on a proposal that will move the current Nutch 
trunk at http://svn.apache.org/repos/asf/nutch/trunk to 
http://svn.apache.org/repos/asf/nutch/branches/nutchgora 
and then will merge the current 1.4-development branch at 
http://svn.apache.org/repos/asf/nutch/branches/branch-1.4 
into trunk.
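
For illustration only, the two steps could be sketched as below. This is a 
minimal sketch, not a record of the commands actually run: it assumes the 
reorganisation is done with server-side URL operations, and it assumes that, 
once the old trunk has been moved away, "merging" branch-1.4 into trunk 
amounts to copying that branch to the trunk location. The function name 
`plan_reorg` and the commit messages are hypothetical. The script only builds 
and prints the commands; it does not execute them.

```python
import shlex
from typing import List

# Repository root taken from the URLs quoted in this message.
REPO = "http://svn.apache.org/repos/asf/nutch"


def plan_reorg(repo: str = REPO) -> List[List[str]]:
    """Return the svn commands implied by the proposal, without running them."""
    trunk = f"{repo}/trunk"
    nutchgora = f"{repo}/branches/nutchgora"
    branch_14 = f"{repo}/branches/branch-1.4"
    return [
        # Step 1: keep the Gora-based trunk, with its history, as a branch.
        ["svn", "move", trunk, nutchgora,
         "-m", "Move current trunk to branches/nutchgora"],  # hypothetical message
        # Step 2 (ASSUMPTION): trunk no longer exists after step 1, so the
        # 1.4 development line is brought in as a copy of its branch.
        ["svn", "copy", branch_14, trunk,
         "-m", "Make branch-1.4 the new trunk"],  # hypothetical message
    ]


if __name__ == "__main__":
    for command in plan_reorg():
        # shlex.join quotes the commit message so the line is copy-pasteable.
        print(shlex.join(command))
```

Because the move is a rename inside the repository, the nutchgora branch keeps 
the full history of the old trunk.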

If folks want to leverage Nutch with Gora, and/or contribute to it there, I 
will consider those folks candidates for committers as I would anyone who is 
contributing to trunk, and I would hope the rest of the Nutch dev community 
would also. Then, if you have the time and resources, and others do too, you 
can selectively move the relevant parts of the system into trunk (and help 
maintain them where it makes sense) as you and the rest of the community (dev 
and users) see fit. Commit early, commit often. Discussions with the rest of 
the community. Starting small, growing big. All parts of developing in the 
Apache way.

However, the current set of active Nutch committers have found that using their 
expertise to maintain the 1.x series of Nutch releases (pre-Gora) is a more 
productive use of their time, since none of those active Nutch committers are 
Gora experts (including myself). We are trying to learn though, at least I know 
I am. So, given that, we are proposing to make the Nutch active branch of 
development (called trunk in SVN terms) the branch that all of us know how to 
maintain and that, furthermore, we are getting the most questions and activity 
from the user community about.

Hope that helps to clarify.

Cheers,
Chris



On Sep 18, 2011, at 4:08 PM, Radim Kolar wrote:

 -1
 
 I don't want to mark release 2.0 as unmaintained. The Cassandra backend 
 works really well for us and fixed performance problems with the hadoop 
 database. Instead of moving it out of trunk, more people should be 
 recruited to come and fix open problems. Don't give up.

