Re: [DISCUSS] What will happen to Nutch Gora aka Nutchbase (was Re: [VOTE] Move 2.0 out of trunk)
Note to all: please use the [DISCUSS] thread format to discuss the VOTE, and please don't reply-all to the VOTE thread and sully up the VOTE tallies with discussion.

Radim,

Thanks for your email. What you propose has been suggested as an option. The best way to help see it happen sooner rather than later is to get involved and/or contribute towards discussion, design, code, etc., for the issues that you are interested in. We welcome any contributions in this area.

The nutchgora branch will still be there, and if there's a desire to have a nutchcassandra or nutchhbase pure branch, and you have some spare cycles to help see it come about, we would welcome it.

Cheers,
Chris

On Sep 19, 2011, at 7:30 AM, Radim Kolar wrote:

> I'm glad to hear that there are at least 2 people in the community that do business in their field and proudly use a Nutch-based crawler together with Cassandra to store the data through Gora. That would not have been possible with the Nutch 1.x version.
>
> What about dropping Gora, because it is progressing too slowly, and making Nutch 2.x only Cassandra/Hadoop DB based?

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: [DISCUSS] What will happen to Nutch Gora aka Nutchbase (was Re: [VOTE] Move 2.0 out of trunk)
Hi Radim,

On Sep 19, 2011, at 9:22 AM, Radim Kolar wrote:

>> The nutchgora branch will still be there, and if there's a desire to have a nutchcassandra or nutchhbase pure branch, and you have some spare cycles to help see it come about, we would welcome it.
>
> It needs to be done in a more long-term, strategic way:
> 1. research what ppl expect from Nutch 2
> 2. what Gora backends they use / want to use
> 3. to drop Gora or not

Sure, in fact, there have been several ongoing conversations related to this for over a year now. See these threads:

http://s.apache.org/HhP
http://s.apache.org/zJX
http://s.apache.org/4tC
http://s.apache.org/BkM
http://s.apache.org/ka
http://s.apache.org/Rbi
http://s.apache.org/XZe
http://s.apache.org/X8F
http://s.apache.org/bKr
http://s.apache.org/gu
http://s.apache.org/gN9
http://s.apache.org/OCZ
http://s.apache.org/QID
http://s.apache.org/xk
http://s.apache.org/gw
http://s.apache.org/p6w

Feel free to contribute to the discussion.

Cheers,
Chris
[DISCUSS] What will happen to Nutch Gora aka Nutchbase (was Re: [VOTE] Move 2.0 out of trunk)
Hi Radim,

Thanks for your feedback. Just to dispel the thought that this VOTE will remove the Nutch-with-Gora version from SVN: it won't (not that it could ever fully remove it anyway; since SVN is a version control system, Nutch-with-Gora will always be around in some form or fashion).

Simply, we are VOTE'ing on a proposal that will move the current Nutch trunk at:

http://svn.apache.org/repos/asf/nutch/trunk

to:

http://svn.apache.org/repos/asf/nutch/branches/nutchgora

and then will merge the current 1.4-development branch at:

http://svn.apache.org/repos/asf/nutch/branches/branch-1.4

into trunk.

If folks want to leverage Nutch with Gora, and/or contribute to it there, I will consider those folks candidates for committers as I would anyone that's contributing to trunk, and I would hope the rest of the Nutch dev community would also. Then, if you have the time and resources, and others do too, you can selectively move the relevant parts of the system into trunk (and help maintain them where it makes sense) as you and the rest of the community (dev and users) see fit. Commit early, commit often. Discussions with the rest of the community. Starting small, growing big. All parts of developing in the Apache way.

However, the current set of active Nutch committers has found that using their expertise to maintain the 1.x series of Nutch releases (pre-Gora) is a more productive use of their time, since none of those active Nutch committers are Gora experts (including myself). We are trying to learn though, at least I know I am.

So, given that, we are proposing to make the active branch of Nutch development (called trunk in SVN terms) the branch that all of us know how to maintain and that, furthermore, draws the most questions and activity from the user community.

Hope that helps to clarify.

Cheers,
Chris

On Sep 18, 2011, at 4:08 PM, Radim Kolar wrote:

> -1
>
> I don't want to mark release 2.0 as unmaintained. The Cassandra backend works really well for us and fixed performance problems with the Hadoop database. Instead of moving it out of trunk, more ppl should be recruited to come and fix open problems. Don't give up.