Hi, Philip,

         Yesterday I found a program that converts all .txt files
to .html files, and with that all the annotation is done. However, this
is not a final solution, because in the future I may have .pdf or .doc
files to annotate.
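
For reference, the .txt-to-.html workaround can be approximated with a short script. This is only a minimal sketch and not the program mentioned above: the function names `wrap_text_as_html` and `convert_folder` are made up for illustration, and it assumes the input files are UTF-8 plain text with blank lines between paragraphs.

```python
import html
from pathlib import Path


def wrap_text_as_html(text: str, title: str) -> str:
    """Return a minimal HTML page whose body holds the escaped plain text."""
    # Assumption: paragraphs in the source text are separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<html>\n"
        f"<head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body>\n"
        "</html>\n"
    )


def convert_folder(src_dir: str, dst_dir: str) -> int:
    """Write one .html file into dst_dir for every .txt file in src_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for txt_path in sorted(src.glob("*.txt")):
        # Undecodable bytes are replaced rather than aborting the whole run.
        text = txt_path.read_text(encoding="utf-8", errors="replace")
        page = wrap_text_as_html(text, txt_path.stem)
        (dst / (txt_path.stem + ".html")).write_text(page, encoding="utf-8")
        count += 1
    return count
```

Calling `convert_folder("corpus_txt", "corpus_html")` (both folder names are placeholders) would leave the original .txt files untouched and produce an HTML copy of each one that can then be given to the populator.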

          I am sure the attached document is not annotated. I checked
it this way: I have an HTML file with the same content as the .txt
file, and I used toolPopulate to annotate both of them. Then I ran an
entity pattern search with the keyword "Rice University" (an object
whose name is exactly equal to "Rice University"). In the results,
the HTML document is retrieved, but the .txt one is not. This
convinced me that the .txt file is not annotated.

        Also, the toolPopulate panel shows the following message
after I chose the .txt file to annotate:

Checking (please wait) ...
Check: SUCCESS!

Processing file(s) ...

Completed: 100% ( 1 of 1 files processed )

Indices optimized !

-=[ TOTALS ]=-
Directory files: 1
Start time: Fri Jun 04 08:13:57 CDT 2010
End time: Fri Jun 04 08:13:57 CDT 2010
Total time (ms): 47

-=[ STATISTICS ]=-
Document count: 1
Document size (kb): 0
Create time (ms): 0
Parse features time (ms): 0
Annotation time (ms): 0
Store time (ms): 0
Index sync time (ms): 0
Index opt time (ms): 0
----------------------------------------------------------------
End Time: Fri Jun 04 08:13:57 CDT 2010
----------------------------------------------------------------
Finished.

       From this message it doesn't look like the file was annotated.
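
The same reading of the report can be done mechanically. The following is a rough sketch that assumes the report always uses the "Name: number" layout shown above; `parse_statistics` and `looks_unannotated` are hypothetical helper names, not part of KIM. Note that zero timings are only a hint, since a very small file might also be processed in under a millisecond.

```python
import re


def parse_statistics(log_text: str) -> dict:
    """Collect the 'Name: number' lines of the populator report into a dict."""
    stats = {}
    for line in log_text.splitlines():
        # Only lines whose value is a plain integer are kept, so date lines
        # such as "Start time: Fri Jun 04 ..." are skipped.
        match = re.match(r"^\s*([^:]+?)\s*:\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def looks_unannotated(stats: dict) -> bool:
    """Heuristic: a document was counted, but no parse or annotation time."""
    return (
        stats.get("Document count", 0) > 0
        and stats.get("Parse features time (ms)", 0) == 0
        and stats.get("Annotation time (ms)", 0) == 0
    )
```

Applied to the report above, `looks_unannotated(parse_statistics(report_text))` would flag the run, because one document is counted while both the parse and the annotation times are 0 ms.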

       Thank you very much for your help!

Fangkai

On Fri, Jun 4, 2010 at 6:02 AM, Philip Alexiev
<[email protected]> wrote:
> Hello Fangkai,
>
> Could you send us some of your txt files that you are sure are not
> annotated? This could help us a lot in solving the problem.
>
> Thanks,
> Philip
>
> On 06/03/2010 08:00 PM, Yang Fangkai wrote:
>>
>> hi, Anton,
>>
>>         I tried HTML files, and the population works. But this just
>> doesn't work for txt files...
>>
>>        I checked the populator.xml and found the following configuration:
>>
>>        <INPUT_DOC_EXT>doc,htm,html,txt,page,xml</INPUT_DOC_EXT>
>>
>>        I suspect the populator has already been configured to process
>> txt files. So where is the problem? Thank you!
>>
>> Fangkai
>>
>> 2010/6/3 Yang Fangkai<[email protected]>:
>>
>>>
>>> Anton,
>>>
>>> On Thu, Jun 3, 2010 at 10:39 AM, Anton Andreev
>>> <[email protected]>  wrote:
>>>
>>>>
>>>> Hello Fangkai,
>>>>
>>>> First I would like to point out that the kim-discussion list:
>>>> http://ontotext.com/mailman/listinfo/kim-discussion is dedicated to
>>>> technical questions like this one. Next time please use the
>>>> kim-discussion mailing list, not this one. Thanks.
>>>>
>>>>
>>>
>>> Sorry for the mistake. I will use that list the next time.
>>>
>>>
>>>>
>>>> Now back to your problem:
>>>> What version of KIM do you use? KIM 2.4?
>>>>
>>>>
>>>
>>> Yes. I am using KIM2.4 under Windows XP.
>>>
>>>
>>>>
>>>> Are you using the KIMGate hybrid - a GATE developer with KIM's default
>>>> pipeline - or the tool called "populator", again from the bin folder?
>>>>
>>>
>>> I started KIM by running startkim.bat, and the populator by running
>>> toolPopulate.cmd in the tool folder. I didn't see the tool "populator"
>>> in the bin folder.
>>>
>>>
>>>>
>>>> The latter
>>>> only needs a document source folder and uses an already running KIM
>>>> instance. Do you see that the documents are being annotated? What
>>>> results do you expect, what is missing?
>>>>
>>>>
>>>
>>> Here is what I expect. I have a corpus containing about 2000 docs, and
>>> I want to query over these docs. So I plan to use toolPopulate to
>>> extract entities over these docs (this is what I am trying to do), and
>>> then query over them. I expect to see the entities populated from
>>> these docs, but I didn't see any meaningful entities when I query the
>>> entity from the KIM GUI.
>>>
>>> I don't know if the above makes sense. Thank you!
>>>
>>> Fangkai
>>>
>>>
>>>
>>>>
>>>> The steps you are doing are correct in general.
>>>>
>>>> Best regards,
>>>> Anton Andreev
>>>>
>>>> --
>>>> Anton Andreev
>>>> Account Manager
>>>> Ontotext AD
>>>> Tel: +359 2 875 81 17
>>>> Fax:+359 2 975 32 26
>>>> email: [email protected]
>>>> www.ontotext.com
>>>>
>>>>
>>>>
>>>> On 3.6.2010 г. 18:17 ч., KIM Platform info newsletter wrote:
>>>>
>>>>>
>>>>> Dear List,
>>>>>
>>>>>          I am trying to use the Populate GUI to populate entities from my
>>>>> own corpus. I have downloaded the raw files of the Penn Treebank, i.e., the
>>>>> articles from the Wall Street Journal in plain text form, and pointed the
>>>>> Populate GUI at that folder. However, it seems no entities are populated. I
>>>>> tried adding an .xml file with the same name as the text file, but it still
>>>>> doesn't work. (I checked this by first deleting all files from
>>>>> /context/default/populated, then populating entities from a file, and
>>>>> checking the entities by querying at
>>>>> http://localhost:8080/kim, but no meaningful entities were found.) I am
>>>>> wondering if I missed some steps or important configuration. Thank you
>>>>> very much!
>>>>>
>>>>> Best,
>>>>>
>>>>> Fangkai
>>>>> _______________________________________________
>>>>> interested-in-KIM mailing list
>>>>> [email protected]
>>>>> http://ontotext.com/mailman/listinfo/interested-in-kim
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Fangkai Yang, Ph.D student
>>> Taylor Hall 3.150A
>>> Department of Computer Sciences
>>> The University of Texas at Austin
>>> Austin, 78712-0233, Texas
>>> USA
>>> http://www.cs.utexas.edu/~fkyang
>>> email: [email protected]
>>>
>>>
>>
>>
>>
>
>
> --
> Philip Alexiev<[email protected]>
> Software Engineer
> Ontotext AD
>
>



-- 
Fangkai Yang, Ph.D student
Taylor Hall 3.150A
Department of Computer Sciences
The University of Texas at Austin
Austin, 78712-0233, Texas
USA
http://www.cs.utexas.edu/~fkyang
email: [email protected]
_______________________________________________
Kim-discussion mailing list
[email protected]
http://ontotext.com/mailman/listinfo/kim-discussion
