Ahh, I saw your message after I sent mine, but my suggestion would still be valid.

Gokhan

On 3/16/2012 12:44 PM, Tim Smith wrote:
One small correction - I'm using an Iterate Over Select module, not Bind by Select to process each file.

Thanks,

Tim


On Fri, Mar 16, 2012 at 3:35 PM, Tim Smith wrote:

    Hi,

    I'm attempting to process ~250 XML files into RDF.  I created a
    schema for the files using XMLSpy and imported the schema into TBC
    using the XSD importer.  This created two .ttl files.

    I created an SM script that iterates over the files using
    tops:files via a Bind by Select module.  Prior to the Bind by
    Select, I import the schema ontologies and my target ontology.  In
    the body, I import each XML file, convert it to RDF and then run a
    series of CONSTRUCT queries to map each file into the target
    ontology.  The combination of all triples generated is then saved
    to disk.

    The script works fine if I only run through a small number of
    files.  However, if I try to hit all 250 at once, it just runs
    slower and slower and slower...  The slow part seems to be the
    CONSTRUCT queries.  They run fast initially but slow significantly
    after 10-20 files.  For every file that I have manually tested by
    running the CONSTRUCT query in the SPARQL view, the query has
    always run very fast so I do not know why performance is so poor
    running as an SM script.

    Any suggestions?  Are there things I can do to speed this along?
    Is there data that I can collect to better inform you?

    My current workaround is to process each directory individually,
    but even that hits the problem because some directories have tens
    of files (not to mention the obvious hassle of changing the script
    for each directory: file names, base URIs, etc.).

    I'm using 3.6B on win7/64 with 5G allocated to the JVM.

    Thanks,

    Tim


--
You received this message because you are subscribed to the Google
Group "TopBraid Suite Users", the topics of which include Enterprise Vocabulary Network (EVN), TopBraid Composer,
TopBraid Live, TopBraid Ensemble, SPARQLMotion and SPIN.
To post to this group, send email to
[email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/topbraid-users?hl=en
