Yes, with import by name it works perfectly fine when I run it standalone, but when I try to run it over Hadoop it goes haywire.
I have to assume, then, that a simple UIMA application which does a simple name annotation will also not run in that case.

Regards
Rohan

On Wed, Jun 11, 2008 at 6:35 PM, Thilo Goetz <[EMAIL PROTECTED]> wrote:

> That's most likely because the XML isn't valid :-)
> Seriously, the "no content allowed in prolog" message
> is sometimes due to an incorrect text encoding.
>
> Does this run ok locally?
>
> --Thilo
>
>
> rohan rai wrote:
>
>> Thanks Thilo. Well, if I do that, all sorts of invalid XML exceptions
>> get thrown:
>>
>> org.apache.uima.util.InvalidXMLException: Invalid descriptor at <unknown source>.
>>     at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:193)
>>     at org.apache.uima.util.impl.XMLParser_impl.parseResourceSpecifier(XMLParser_impl.java:365)
>>     at org.apache.uima.util.impl.XMLParser_impl.parseResourceSpecifier(XMLParser_impl.java:346)
>>     at org.ziva.dq.hadoop.DQHadoopMain$Map.dQFile(DQHadoopMain.java:45)
>>     at org.ziva.dq.hadoop.DQHadoopMain$Map.map(DQHadoopMain.java:37)
>>     at org.ziva.dq.hadoop.DQHadoopMain$Map.map(DQHadoopMain.java:1)
>>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
>>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
>> Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
>>     at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:176)
>>     ... 8 more
>>
>>
>> On Wed, Jun 11, 2008 at 6:08 PM, Thilo Goetz <[EMAIL PROTECTED]> wrote:
>>
>>> You need to use import by name instead of import
>>> by location in your descriptor. Then things get
>>> loaded via the classpath and you should be ok
>>> (provided that you stick your descriptors in the
>>> jar of course). I suggest you test this locally
>>> first by moving your application to a different
>>> machine where you don't have any descriptors
>>> lying around. It'll be easier to debug than in
>>> hadoop.
>>>
>>> --Thilo
>>>
>>>
>>> rohan rai wrote:
>>>
>>>> Well, the question is about running UIMA over hadoop. How to do that,
>>>> as in UIMA there are xml descriptors which have relative urls and
>>>> locations? Which throws exception.
>>>>
>>>> But I can probably do without that answer.
>>>>
>>>> Simplifying the problem:
>>>>
>>>> I create a jar for my application and I am trying to run a map reduce
>>>> job.
>>>>
>>>> In the map I am trying to read an xml resource, which gives this kind
>>>> of exception:
>>>>
>>>> java.io.FileNotFoundException: /tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806102252_0028/task_200806102252_0028_m_000000_0/./descriptors/annotators/RecordCandidateAnnotator.xml (No such file or directory)
>>>>     at java.io.FileInputStream.open(Native Method)
>>>>     at java.io.FileInputStream.<init>(FileInputStream.java:106)
>>>>     at java.io.FileInputStream.<init>(FileInputStream.java:66)
>>>>     at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
>>>>     at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
>>>>     at java.net.URL.openStream(URL.java:1009)
>>>>     at org.apache.uima.util.XMLInputSource.<init>(XMLInputSource.java:83)
>>>>
>>>> I think I need to pass on the content of the jar, which contains the
>>>> resource xml and classes (other than the JOB class), to each and every
>>>> taskXXXXXXX getting created.
>>>>
>>>> How can I do that?
>>>>
>>>> Regards
>>>> Rohan
>>>>
>>>>
>>>> On Wed, Jun 11, 2008 at 5:12 PM, Michael Baessler <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> rohan rai wrote:
>>>>>
>>>>>> Hi
>>>>>> A simple thing such as a name annotator which has an import location
>>>>>> of type starts throwing exception when I create a jar of the
>>>>>> application I am developing and run over hadoop.
>>>>>>
>>>>>> If I have to do it in a java class file then I can use
>>>>>> XMLInputSource in = new XMLInputSource(ClassLoader.getSystemResourceAsStream(aeXmlDescriptor),null);
>>>>>>
>>>>>> But the relative paths in annotators, analysis engines etc. start
>>>>>> throwing exceptions.
>>>>>>
>>>>>> Please Help
>>>>>>
>>>>>> Regards
>>>>>> Rohan
>>>>>
>>>>> I'm not sure I understand your question, but I think you need some
>>>>> help with the exceptions you get.
>>>>> Can you provide the exception stack trace?
>>>>>
>>>>> -- Michael
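
One way to narrow down the two errors in this thread, sketched here under assumptions rather than offered as a fix: run a small check from inside the map task. It turns an import-by-name value into the classpath resource that import by name refers to (dots become slashes, plus `.xml`), and then looks at the first few bytes the task actually sees for that resource. `DescriptorProbe` and its method names are hypothetical helpers, not part of UIMA or Hadoop. The byte check only hints at possible reasons for "Content is not allowed in prolog" (for example, the resource resolving to something that is not the XML file); it does not prove a cause.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical diagnostic helper; not part of UIMA or Hadoop. */
class DescriptorProbe {

    /** Maps an import-by-name value to the classpath resource it refers to. */
    public static String resourcePathForImportName(String importName) {
        return importName.replace('.', '/') + ".xml";
    }

    /** Classifies the first bytes of a stream that is supposed to hold XML. */
    public static String describeStart(InputStream in) {
        if (in == null) {
            return "MISSING"; // resource was not found on the classpath
        }
        byte[] head = new byte[3];
        int n = 0;
        try {
            // Read up to three bytes; a single read() may return fewer.
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        } catch (IOException e) {
            return "UNREADABLE";
        }
        if (n == 0) {
            return "EMPTY";
        }
        int b0 = head[0] & 0xFF;
        int b1 = n > 1 ? head[1] & 0xFF : -1;
        int b2 = n > 2 ? head[2] & 0xFF : -1;
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) {
            return "UTF8_BOM";
        }
        if ((b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE)) {
            return "UTF16_BOM";
        }
        if (b0 == '<') {
            return "XML";
        }
        return "OTHER"; // e.g. a jar/zip header or plain text
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: DescriptorProbe <import.name>");
            return;
        }
        String path = resourcePathForImportName(args[0]);
        // Use the context class loader, falling back to the system loader.
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        try (InputStream in = cl.getResourceAsStream(path)) {
            System.out.println(path + " -> " + describeStart(in));
        } catch (IOException e) {
            System.out.println(path + " -> UNREADABLE");
        }
    }
}
```

Called with an import name such as `descriptors.annotators.RecordCandidateAnnotator` (derived from the path in the stack trace above, so an assumption about how the jar is laid out), a result of `MISSING` would suggest the descriptor is not on the task's classpath, while `OTHER` or a BOM result would suggest looking at what the resource really contains and how it is handed to the parser.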
