To clarify, the log below is for about 4 of my 5 PDF docs that can't be parsed by Nutch.

On Fri, Jan 11, 2013 at 8:29 AM, Bayu Widyasanyata <[email protected]> wrote:

> nutch parsing is still problem on pdf files.
> Only 1 pdf can be parsed successfully.
>
> 2013-01-11 08:11:23,679 WARN parse.ParseUtil - Unable to successfully parse content http://localhost/sapi/nospasi_Akhirat_Lebih_Utama_Daripada_Dunia.pdf of type application/pdf
>
> Even I had added on parse-plugins.xml explicitly:
>
> <mimeType name="application/pdf">
>   <plugin id="parse-tika" />
> </mimeType>
>
> What the missed things?
>
> On Fri, Jan 11, 2013 at 7:55 AM, Lewis John Mcgibbney <[email protected]> wrote:
>
> > No problem at all.
> >
> > Better safe than sorry.
> >
> > Lewis
> >
> > On Thu, Jan 10, 2013 at 4:43 PM, Bayu Widyasanyata <[email protected]> wrote:
> >
> > > Yes, I forgot that things even I already put on my notes on previous
> > > installation.
> > > I'm quite new on nutch and also Java developments :)
> > >
> > > Thanks!
> > >
> > > On Fri, Jan 11, 2013 at 7:01 AM, Lewis John Mcgibbney <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > > java.io.IOException: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
> > > >
> > > > If you look at ivy.xml [0] you will see that the mysql-connector-java
> > > > dependency is commented out. Please uncomment it, then build Nutch 2.x
> > > > src again.
> > > >
> > > > This will download the dependency and make it available on your
> > > > classpath.
> > > >
> > > > Thank you
> > > >
> > > > Lewis
> > > >
> > > > [0] http://svn.apache.org/viewvc/nutch/branches/2.x/ivy/ivy.xml?view=markup
> > >
> > > --
> > > wassalam,
> > > [bayu]
> >
> > --
> > *Lewis*
>
> --
> wassalam,
> [bayu]

--
wassalam,
[bayu]

