Hi, I have a situation where I have 10,000+ small text files in a rather deep file system tree. Ultimately, I need to attach these files to instances in an ontology based on the presence/absence of various words in the files.
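To make the intent concrete, here is a rough sketch of the gathering-and-matching step in plain Python, outside of SPARQLMotion (the directory layout, file extension, and table name are made up for illustration):

```python
import os

def find_text_files(root):
    """Recursively collect paths of text files under a deep directory tree."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".txt"):
                paths.append(os.path.join(dirpath, name))
    return paths

def tag_files_by_keyword(paths, keywords):
    """Map each keyword (e.g. a database table name) to the files that mention it."""
    tags = {kw: [] for kw in keywords}
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as f:
            text = f.read()
        for kw in keywords:
            if kw in text:
                tags[kw].append(path)
    return tags
```

The question is essentially how to get this kind of recursive traversal and keyword-driven tagging done inside the SPARQLMotion/Lucene toolchain rather than in an external script.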
For example, if a file mentions the name of a particular database table, that file should be attached to the instance representing that table. I believe the embedded Lucene engine can do this. My thinking is to process each file sequentially, making each one an instance of a File class, where each instance has a name (the file name) and a property containing the contents of the file. Then, if I can trigger Lucene to index the resulting text strings, I will be able to search for the keywords I'm interested in and CONSTRUCT the tagging relationships (using pf:Match).

However, I'm struggling with two things: how to process all the files in a given directory tree, and how to trigger Lucene to index the text strings added to a model. SPARQLMotion supports importing a single text file but does not appear to support multiple files or directory trees.

Any suggestions on how to process large numbers of files and trigger the indexing process?

Thanks in advance,
Tim

--
You received this message because you are subscribed to the Google Group "TopBraid Suite Users", the topics of which include Enterprise Vocabulary Network (EVN), TopBraid Composer, TopBraid Live, TopBraid Ensemble, SPARQLMotion and SPIN.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/topbraid-users?hl=en
