I think I got it. Thanks for all the help, everyone. To make a
simple UIMA app work over Hadoop (I did it in a pseudo-distributed
environment), four factors come together:
1) The UIMA app, along with the mapper, the reducer, your job main class,
and the resources, should all be contained within the job jar you create.
2) All imports in the descriptors should probably be imports by name (I
haven't verified whether import by location also works).
3) Any resource read by any of the classes should be loaded via the
ClassLoader, e.g.
    XMLInputSource in = new XMLInputSource(
        ClassLoader.getSystemResourceAsStream(aeXmlDescriptor), null);
4) When an AnalysisEngine (or any similar UIMA resource) is produced (I am
doing it in the mapper), a ResourceManager should be used, e.g.
    ResourceManager rMng = UIMAFramework.newDefaultResourceManager();
    // Here str is the path to any of the resources, which can be
    // obtained via ClassLoader.getSystemResource(aeXmlDescriptor).getPath()
    rMng.setExtensionClassPath(str, true);
    rMng.setDataPath(str);
    aEngine = UIMAFramework.produceAnalysisEngine(aSpecifier, rMng, null);
The 4th point matters because when an XML file is read without using the
class loader, it is by default looked up in the temp task directory, e.g.
/tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806112341_0002/task_200806112341_0002_m_000000_0/
But all the resources and classes get unjarred into the
/tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806112341_0002/work
directory.
So, to tell the system to look for the resources in the correct
directory when the class loader is not used (which is the case with
UIMA's XMLInputSource), we have to use the ResourceManager.
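
To illustrate the difference between the two lookups, here is a minimal
sketch that uses only the JDK (no UIMA or Hadoop involved). It is an
illustration under assumptions, not the code from my job: the class name
ClasspathLookupDemo, the helper readFromClasspath, and the temporary
"work" directory are all made up for the example, and the resource name is
simply modelled on the descriptor from this thread. The temporary
directory plays the role of Hadoop's "work" directory: it is on the class
path of a class loader, but it is not the process working directory.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClasspathLookupDemo {

    // Hypothetical resource name, modelled on the descriptor in this thread.
    static final String RESOURCE = "types/recordCandidateType.xml";

    /** Reads a resource through the given class loader; returns null if it is not found. */
    static String readFromClasspath(ClassLoader loader, String name) throws IOException {
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Builds a stand-in "work" directory and reads the resource from it via a class loader. */
    static String demo() throws IOException {
        // Stand-in for the directory into which the job jar gets unpacked.
        Path work = Files.createTempDirectory("work");
        Path types = Files.createDirectories(work.resolve("types"));
        Files.write(types.resolve("recordCandidateType.xml"),
                "<typeSystemDescription/>".getBytes(StandardCharsets.UTF_8));

        // A relative File is resolved against the process working directory,
        // which is not the temporary "work" directory created above.
        boolean foundRelative = new File(RESOURCE).exists();
        System.out.println("found via relative path: " + foundRelative);

        // A class loader that has "work" on its class path finds the resource
        // regardless of the working directory.
        try (URLClassLoader loader =
                new URLClassLoader(new URL[] { work.toUri().toURL() }, null)) {
            return readFromClasspath(loader, RESOURCE);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("found via class loader: " + demo());
    }
}
```

The lookup through the class loader depends only on what is on the class
path, which is why points 3 and 4 above route every resource through the
class loader or through a ResourceManager that has been told where the
unjarred files live.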
Regards
Rohan
On Thu, Jun 12, 2008 at 12:34 AM, Marshall Schor <[EMAIL PROTECTED]> wrote:
> In the Jar that is being deployed, can you unzip it (Jars can be unzipped
> by any unzip tool) and see if it has in it (among many other things):
>
> <the top level / directory>
>     |
>     + types
>         |
>         + recordCandidateType.xml
> in other words, right below the top level, a directory called "types", and
> in that directory, a file called "recordCandidateType.xml" ?
>
> -Marshall
>
>
> rohan rai wrote:
>
>> Anyway, just to specify: neither import by name nor import by location
>> works. Import by name results in the following exception. If there is
>> some other way to specify the classpath, then I don't know it.
>>
>> org.apache.uima.resource.ResourceInitializationException: An import
>> could not be resolved. No .xml file with name
>> "types.recordCandidateType" was found in the class path or data path.
>> (Descriptor: <unknown>)
>> at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:121)
>> at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:109)
>> at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:124)
>> at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:258)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:303)
>> at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:383)
>> at org.ziva.dq.hadoop.DQHadoopMain$Map.dQFile(DQHadoopMain.java:64)
>> at org.ziva.dq.hadoop.DQHadoopMain$Map.map(DQHadoopMain.java:44)
>> at org.ziva.dq.hadoop.DQHadoopMain$Map.map(DQHadoopMain.java:1)
>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
>> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
>>
>>
>> On Wed, Jun 11, 2008 at 7:17 PM, rohan rai <[EMAIL PROTECTED]> wrote:
>>
>>
>>
>>> I am sorry which jar are you talking about....To run UIMA App as a
>>> standalone I do not have to create the jar
>>> Are you saying Create a jar of the APP and then run it as a standalone??
>>>
>>> Regards
>>> Rohan
>>>
>>>
>>>
>>> On Wed, Jun 11, 2008 at 7:10 PM, Thilo Goetz <[EMAIL PROTECTED]> wrote:
>>>
>>>
>>>
>>>> So when you run it in Eclipse, it should run with
>>>> just the jar in the classpath, and no special setup
>>>> for the descriptors. I assume you tried that?
>>>>
>>>> --Thilo
>>>>
>>>>
>>>> rohan rai wrote:
>>>>
>>>>
>>>>
>>>>> All the descriptors are in the jar....The whole app is in the
>>>>> jar.....then only I am running the jar on hadoop
>>>>>
>>>>> Regards
>>>>> Rohan
>>>>>
>>>>> On Wed, Jun 11, 2008 at 6:54 PM, Thilo Goetz <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Best to put the descriptor in the jar, as I said earlier...
>>>>>>
>>>>>>
>>>>>> rohan rai wrote:
>>>>>>
>>>>>>> Damn it can be run...somebody really gotcha put it in web ASAP...I
>>>>>>> promise if I somehow make it run in my m/c I will definitely put it
>>>>>>> up in my blog....
>>>>>>>
>>>>>>> Hey by the way to run UIMA annotator via eclipse with import name I
>>>>>>> have to add classpath in the build path(using eclipse)... Do I have
>>>>>>> to do something special to take care of that when running the same
>>>>>>> app in hadoop... Running hadoop via command line....
>>>>>>>
>>>>>>> Regards
>>>>>>> Rohan
>>>>>>>
>>>>>>> On Wed, Jun 11, 2008 at 6:47 PM, Thilo Goetz <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>> I know for a fact that UIMA applications can be run on hadoop,
>>>>>>>> so don't give up too quickly. In your local tests, you need
>>>>>>>> to make sure that the system is really using the descriptor
>>>>>>>> you think it's using (which is why I suggested you test on a
>>>>>>>> different machine), not something it picks up from the environment.
>>>>>>>>
>>>>>>>> --Thilo
>>>>>>>>
>>>>>>>>
>>>>>>>> rohan rai wrote:
>>>>>>>>
>>>>>>>>> Yes with name import if I run it as a standalone it works perfectly
>>>>>>>>> fine but when I try to do it over hadoop then it goes haywire.
>>>>>>>>>
>>>>>>>>> I have to assume then a simple UIMA application which does a simple
>>>>>>>>> name annotation will also not run in that case
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Rohan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>
>>
>
>