Re: Nutch 1.0 trunk Fetch Schedule
Well, you can always write a bash script or a Java class that does this. Writing a Java class is probably better and easier. There is a manual for importing Nutch into Eclipse, in case you don't know how. I needed a similar thing done, and it turned out that using Java really is easier...

On Wed, Mar 18, 2009 at 12:36 PM, MyD myd.ro...@googlemail.com wrote:
> Hi all,
>
> Is it possible to set the next fetch schedule for a URL in another crawl dir?
>
> Example:
> crawl.dir.A - retrieve links and set the fetch schedule, but this should go into crawl.dir.B
> crawl.dir.B
>
> Thanks in advance.
>
> Regards,
> MyD
> --
> View this message in context: http://www.nabble.com/Nutch-1.0-trunk-Fetch-Schedule-tp22577234p22577234.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
Re: Nutch 1.0 trunk Fetch Schedule
Hi ripper,

Thanks. Do you know how to do it in Java? I tried to, but haven't found the suitable classes.

Thanks in advance.

Cheers,
MyD
Re: Nutch 1.0 trunk Fetch Schedule
> you have a manual for importing nutch into eclipse in case you don't know how

Can you please mention the link? Thanks in advance.
Re: Nutch 1.0 trunk Fetch Schedule
OK, this is how I did it. I created a class in the org.apache.nutch.crawl package, the same package that holds the Crawl class (which is Nutch's main class, called by the crawl command). In that class, you create the Crawl class with the appropriate parameters. Just look at the code once you import it into Eclipse, or at the Javadoc.

Here's the link: http://wiki.apache.org/nutch/RunNutchInEclipse0.9

Just a tip: try using Google or the Nutch wiki, there's a lot of useful stuff there :-)
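The approach described above might look roughly like the sketch below. It is not ripper07's actual code. It assumes the `crawl` command's usual argument order (`<urlDir> [-dir d] [-threads n] [-depth i] [-topN N]`) and that `org.apache.nutch.crawl.Crawl` exposes a standard static `main(String[])`. The names `CrawlLauncher`, `buildArgs`, and `runCrawl`, and the values `urls`, `3`, and `50`, are hypothetical. Reflection is used only so the sketch compiles without the Nutch jars; inside the Nutch source tree you would simply call `Crawl.main(args)`. Note that this only points a crawl at crawl.dir.B; it does not by itself set a per-URL fetch schedule.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical launcher; the class and method names are illustrative, not part of Nutch. */
class CrawlLauncher {

    /** Builds the argument array in the order the crawl command expects. */
    static String[] buildArgs(String urlDir, String crawlDir, int depth, int topN) {
        List<String> args = new ArrayList<>();
        args.add(urlDir);                    // directory holding the seed URL list
        args.add("-dir");
        args.add(crawlDir);                  // target crawl directory, e.g. crawl.dir.B
        args.add("-depth");
        args.add(Integer.toString(depth));
        args.add("-topN");
        args.add(Integer.toString(topN));
        return args.toArray(new String[0]);
    }

    /** Invokes Crawl.main reflectively (assumption: Nutch is on the classpath). */
    static void runCrawl(String[] args) throws Exception {
        Class<?> crawl = Class.forName("org.apache.nutch.crawl.Crawl");
        Method main = crawl.getMethod("main", String[].class);
        main.invoke(null, (Object) args);    // cast keeps the array as one argument
    }

    public static void main(String[] argv) throws Exception {
        String[] args = buildArgs("urls", "crawl.dir.B", 3, 50);
        System.out.println(String.join(" ", args));
        try {
            runCrawl(args);
        } catch (ClassNotFoundException e) {
            System.err.println("Nutch is not on the classpath; nothing was crawled.");
        }
    }
}
```

Keeping the argument building in its own method makes it easy to launch one crawl per directory (crawl.dir.A, then crawl.dir.B) from the same class.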
Re: Nutch 1.0 trunk Fetch Schedule
Thanks... Well, I had tried a similar thing. I was just wondering if there was a better way to do the same. Have you tried anything of this sort? It would be great if you could help me out with this.

Also, I want Nutch to perform wildcard query search (as in, if the search query is book*, then it should return all search results which contain "book" followed by any text). This is possible in Luke with Lucene. But how can I implement it in Nutch search (in my Java code)?