Hi Shannon,

Here's a little background to help track down what might be happening.

Information Studio uses the Content Processing Framework (CPF) and a dedicated "Fab" database under the covers to do transformations. If you have transformations configured, the documents that your collector ingests go into the Fab database, in a directory corresponding to the CPF processing domain for your Flow. Information Studio creates a pipeline for you and attaches it to the domain, including a final step that inserts the document into your destination database. (If you don't have transformations configured, the documents go straight into your destination database without going through Fab.)

In the pipeline, Information Studio uses a generic XSLT CPF handler that was added as part of XSLT support, and frankly I'm not sure how it handles cases where there are multiple output documents.

Take a look at your Fab database. Are the documents you expected sitting in there? What do the URIs look like? One possibility is that the documents are in Fab but have been assigned URIs outside of the processing domain, so they're not being picked up by CPF and moved through the pipeline into your target database. If the documents are not in Fab, then we need to back up and understand what that generic handler is doing with the output documents (so yes, a stylesheet and a simple input document would be really helpful).

That said, for this release Flows are optimized for single-document-in, single-document-out, linear pipelines. Document splitting is a common scenario, and there are a couple of other approaches you could try. One that I like is to write a custom Collector that does the document splitting up front, before the documents are inserted. There's a nice example posted to GitHub by Justin Makeig along with his intro video (see the video and the link to the GitHub project at http://developer.marklogic.com/blog/information-studio-intro-video).
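
To make the "split up front" idea concrete, here is a minimal sketch of just the splitting logic, written in Python for illustration. It is not the Information Studio Collector API (a real Collector would be written in XQuery), and it makes several assumptions that are mine, not yours: that the input is a `teiCorpus` whose direct children are `TEI` elements, that a hypothetical `/tei/` base URI is where the pieces should land, and that `xml:id` is a reasonable source for document names.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def split_tei_corpus(corpus_xml, base_uri="/tei/"):
    """Split a teiCorpus into one (uri, xml_string) pair per child TEI element.

    base_uri is a hypothetical target directory; choose it so the resulting
    URIs fall wherever your destination database expects them.
    """
    # Serialize TEI elements with a default namespace instead of ns0: prefixes.
    ET.register_namespace("", TEI_NS)
    root = ET.fromstring(corpus_xml)

    docs = []
    for index, tei in enumerate(root.findall(f"{{{TEI_NS}}}TEI"), start=1):
        # Prefer xml:id for the document name; fall back to a running number.
        doc_id = tei.get(f"{{{XML_NS}}}id") or f"doc-{index}"
        uri = f"{base_uri}{doc_id}.xml"
        # strip() drops the whitespace that followed the element in the corpus.
        docs.append((uri, ET.tostring(tei, encoding="unicode").strip()))
    return docs
```

Each pair returned would then be inserted as its own document, so no multi-output transformation step is needed in the Flow.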

If splitting is your only processing step, this has an advantage in that the documents don't need to go through Fab, which is much faster. And if what you're doing is generic, then it's particularly nice because you can reuse that Collector in many flows, without having to add the transformation step each time.

--Colleen

Colleen Whitney
MarkLogic Corporation
Phone +1 650 655 2366
web www.marklogic.com

________________________________________
From: [email protected] [[email protected]] On Behalf Of Shannon [[email protected]]
Sent: Friday, October 22, 2010 8:44 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Information Studio Transform

Hi MarkLogic,

For a Transformation Step in an InfoStudio Flow, I've inserted a stylesheet as an XSLT Transform, which basically splits a TEI corpus into individual XML documents and constructs some metadata in a certain namespace for indexing purposes (I can post the thing if that would help). It works in Oxygen with Saxon-PE 9.2.0.6 as the specified transformer; however, it effectively does nothing as part of the flow editor: only 1 document is reported as loaded, and the split TEI docs don't appear in the database.

Thanks in advance for your help. I'm really looking forward to finding out what I'm doing wrong. Moving ahead, thanks to 4.2, content loading with a persistent flow including a preprocessing transform step will really help cut down on the number of steps in my workflow.

This is with MarkLogic Server, Standard Edition, Personal License, OS X Dev-only, as we haven't yet upgraded our Red Hat Enterprise servers to 4.2.

Thanks,
Shannon

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
