Hello Patrick,

On Dec 11, 2009, at 17:46, Patrick Ohly wrote:
> I'm currently thinking about synchronizing files and/or complete
> directory hierarchies.

I very much welcome this idea!

> Is there any kind of support for OMA-TS-DS DataObjFile in libsynthesis?

Yes.

> Section 14.4 mentions 'plugin module “FILEOBJ”', but I am not sure
> whether that is enabled and/or included in the open source code? "nm -C
> libsynthesis.a | grep SDK_fileobj" says it's not included.

The Fileobj plugin module is an example DB adaptor intended as a backend
for an OMA-DS DataObjFile. We did not include it because of its
experimental status - Beat created it some years ago to be able to test
the OMA-DS DataObjFile datatype.

> The DataObjFile format describes XML tags to represent file name,
> attributes and content. If this was a MIME dir based format, I would
> know how to write a profile, but with XML, we would need something
> akin to mimedirprofile.cpp/h, right?

Yes, and that something already exists as dataobjtype.h/.cpp. This is a
datatype similar to mimediritemtype and textitemtype (the one that can
be used for RFC2822 mail), but it implements the XML/WBXML format of
OMA-DS DataObj. Beat derived this from TextItem in 2005, and it can
embed the textprofile.h/.cpp functionality to implement OMA-DS EmailObj.

We haven't used it for a long while for lack of a real-world use case
(or rather time to establish one - some ideas like camera roll photo
sync were around for a while), but it should be functional.

The configuration of a dataObj is not yet documented, but works
similarly to a textItem. Instead of <linemap>s, which map fields to
particular lines or headers in a text, it has <tagmap>s to map fields to
XML tags. Find a snippet from a test config we used back in 2005 at the
end of this message.

> Regarding DataObjFile, is the directory name meant to be included in
> the <name> tag? I suspect that SyncML's <SourceParent> in combination
> with DataObjFolder is meant to be used. libsynthesis has only
> rudimentary (no?) support for the parent ID, hasn't it?
> I remember that it is part of the DB API, without any use in the
> engine at the moment.

Correct. The implementation of hierarchical data stores is completely
missing at this time. When defining the DB API, we made sure it has the
required "parentID" fields to be prepared.

> When discussing the strengths of libsynthesis, it was mentioned that
> data can be extracted directly from WBXML. I understand that part and
> now wonder about the next step: can large items also be streamed
> directly to and from the underlying database?

The architecture is there, and also in use, but not yet to the full
extent possible.

> I'm thinking of a field list which contains one field that represents
> binary file content. When synchronizing a large file, it would be nice
> if the content of that field could be read and written from disk instead
> of buffering the whole item in memory.

What is implemented are so-called field proxies (for large string fields
and BLOBs). When slow-syncing large data (the use case was email back in
2003), it is very important that the engine does not try to load these
large items when loading the sync set for slow sync matching, but only
their metadata. It was not so important to avoid holding single items in
memory one after another, so complete stream-through is not yet
implemented.

The problem with stream-through is that it would require format encoders
(dataobjtype in particular) to become aware of object chunking at the
SyncML protocol level and vice versa. At this time, SyncML object
chunking (and reassembly) is done in memory. A new mechanism for chunked
encoding and decoding at the SyncItemType level would be needed.

> The only part in the DB API that looks feasible for that is
> Read/WriteBlob and there is indeed an entry in the SDK pdf about this:
>
> There are two extensions to this syntax:
> • BLOBs: For binary large object blocks the field contains only a
> reference to the BLOB identifier which will be read and written with
> ReadBlob/WriteBlob.
> Syntax: aa;BLOBID=xyz where <xyz> is the name of the BLOB.
>
> But that is about the syntax in ReadNextItem; how can the same be done
> with ReadNextItemAsField? Probably ReadNextItemAsKey?

With the text format, you can just tell the engine that a proxy for a
certain field (BLOB) should be created by returning the BLOBID=xxx part.
In fact, you MUST do it that way because there's no way to encode BLOBs
in the text format. The field proxy will not read the actual content
until the engine actually needs the BLOB contents (= often not at all
after loading a sync set for a slow sync, because slow sync matches are
usually done with metadata, not BLOB contents).

For the xxxAsKey variant, you can directly read and write binary data
using GetValue/SetValue, but you can't return the BLOBID=xxx part. So
here it works differently: you need to specify (as you correctly
guessed) "p" in the mode for those fields in the <map>s. The engine will
derive the BlobID from the <map> name (and add a [n] if the mapped field
is an array).

> How about storing a new or updated item in the
> DB - would the engine then ask the DB layer to create a blob once it
> starts parsing an item which contains such a field? There is a field
> type "blob", but I'm not sure how that works.

For storing data that came in via SyncML, the engine will call WriteBlob
for "p" mapped fields. But as there is no chunked decoding yet, the
WriteBlob calls for one BLOB will all happen at the same time, once the
entire data item has been fully received from the peer and assembled in
memory. This is still more efficient than using SetValue, which would
require a SECOND buffer of the entire item size. And if a DB plugin
correctly implements Read/WriteBlob now, it will work without a change
with a fully end-to-end streaming engine implementation.

> I thought I'd better ask here first. Of course, there's always the source
> code ;-)

:-) I consulted it as well to write the above...
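
To illustrate the field proxy idea described above (only the BLOB ID is
kept until the content is actually needed), here is a minimal C++
sketch. It is not libsynthesis code: BlobFieldProxy, BlobLoader and the
callback standing in for a plugin's ReadBlob are hypothetical names
invented for this illustration, and the real engine classes and DB API
signatures differ.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a DB adaptor's ReadBlob: given a BLOB id,
// return the BLOB's bytes. Not the real libsynthesis signature.
using BlobLoader = std::function<std::string(const std::string& blobId)>;

// Illustrative field proxy: stores only the BLOB id and fetches the
// content from the backend the first time it is requested.
class BlobFieldProxy {
public:
    BlobFieldProxy(std::string blobId, BlobLoader loader)
        : blobId_(std::move(blobId)), loader_(std::move(loader)), loaded_(false) {
        assert(loader_ && "a loader callback is required");
    }

    // Metadata access: never touches the backend.
    const std::string& id() const { return blobId_; }
    bool isLoaded() const { return loaded_; }

    // Loads the content on first use, then serves the cached copy.
    const std::string& content() {
        if (!loaded_) {
            content_ = loader_(blobId_);
            loaded_ = true;
        }
        return content_;
    }

private:
    std::string blobId_;
    BlobLoader loader_;
    bool loaded_;
    std::string content_;
};
```

In this sketch, a slow sync that matches on metadata would only ever
call id(); content() (and with it the backend read) would be triggered
only for items whose BLOB is actually compared or sent.
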

Best Regards,

Lukas


Config snippet for a fileobj fieldlist and datatype

> <fieldlist name="file_fields">
>   <field name="SYNCLVL" type="integer" compare="never"/>
>   <field name="DOCTYPE" type="string" compare="conflict"/>
>   <field name="FILE_NAME" type="string" compare="conflict"/>
>   <field name="TIME_CREATED" type="timestamp" compare="conflict"/>
>   <field name="TIME_MODIFIED" type="timestamp" compare="conflict"/>
>   <field name="TIME_ACCESSED" type="timestamp" compare="conflict"/>
>
>   <field name="ATTRIBUTE_H" type="integer" compare="conflict"/>
>   <field name="ATTRIBUTE_S" type="integer" compare="conflict"/>
>   <field name="ATTRIBUTE_A" type="integer" compare="conflict"/>
>   <field name="ATTRIBUTE_D" type="integer" compare="conflict"/>
>   <field name="ATTRIBUTE_W" type="integer" compare="conflict"/>
>   <field name="ATTRIBUTE_R" type="integer" compare="conflict"/>
>   <field name="ATTRIBUTE_X" type="integer" compare="conflict"/>
>
>   <field name="CONTENT_TYPE" type="string" compare="conflict"/>
>   <field name="BODY_ENC" type="string" compare="conflict"/>
>   <field name="BODY" type="blob" compare="never"/>
>   <field name="SIZE" type="integer" compare="conflict"/>
>
>   <field name="EXT_NAME" type="string" compare="conflict"/>
>   <field name="EXT_VERS" type="string" array="yes" compare="conflict"/>
>   <field name="EXT_VAL1" type="string" compare="conflict"/>
>   <field name="EXT_VAL2" type="string" compare="conflict"/>
>   <field name="EXT_VAL3" type="string" compare="conflict"/>
> </fieldlist>
>
>
> <datatype name="FILEOBJ" basetype="dataobj">
>   <use fieldlist="file_fields"/>
>   <typestring>application/vnd.omads-file+xml</typestring>
>   <versionstring>1.0</versionstring>
>
>   <tagmap field="DOCTYPE">
>     <xmltag>DOCTYPE</xmltag>
>   </tagmap>
>
>   <tagmap field="FILE_NAME">
>     <xmltag>name</xmltag>
>   </tagmap>
>
>   <tagmap field="TIME_CREATED">
>     <xmltag>created</xmltag>
>   </tagmap>
>
>   <tagmap field="TIME_MODIFIED">
>     <xmltag>modified</xmltag>
>   </tagmap>
>
>   <tagmap field="TIME_ACCESSED">
>     <xmltag>accessed</xmltag>
>   </tagmap>
>
>   <tagmap field="ATTRIBUTE_H">
>     <xmltag>h</xmltag>
>     <booltype>true</booltype>
>     <parent>attributes</parent>
>   </tagmap>
>
>   <tagmap field="ATTRIBUTE_S">
>     <xmltag>s</xmltag>
>     <booltype>true</booltype>
>     <parent>attributes</parent>
>   </tagmap>
>
>   <tagmap field="ATTRIBUTE_A">
>     <xmltag>a</xmltag>
>     <booltype>true</booltype>
>     <parent>attributes</parent>
>   </tagmap>
>
>   <tagmap field="ATTRIBUTE_D">
>     <xmltag>d</xmltag>
>     <booltype>true</booltype>
>     <parent>attributes</parent>
>   </tagmap>
>
>   <tagmap field="ATTRIBUTE_W">
>     <xmltag>w</xmltag>
>     <booltype>true</booltype>
>     <parent>attributes</parent>
>   </tagmap>
>
>   <tagmap field="ATTRIBUTE_R">
>     <xmltag>r</xmltag>
>     <booltype>true</booltype>
>     <parent>attributes</parent>
>   </tagmap>
>
>   <tagmap field="ATTRIBUTE_X">
>     <xmltag>x</xmltag>
>     <booltype>true</booltype>
>     <parent>attributes</parent>
>   </tagmap>
>
>   <tagmap field="CONTENT_TYPE">
>     <xmltag>cttype</xmltag>
>   </tagmap>
>
>   <tagmap field="BODY_ENC">
>     <xmltag>body</xmltag>
>     <xmlattr>enc</xmlattr>
>   </tagmap>
>
>   <tagmap field="BODY">
>     <xmltag>body</xmltag>
>   </tagmap>
>
>   <tagmap field="SIZE">
>     <xmltag>size</xmltag>
>   </tagmap>
>
>   <tagmap field="EXT_NAME">
>     <xmltag>XNam</xmltag>
>     <parent>Ext</parent>
>   </tagmap>
>
>   <tagmap field="EXT_VERS">
>     <xmltag>XVers</xmltag>
>     <parent>Ext</parent>
>   </tagmap>
>
>   <tagmap field="EXT_VAL1">
>     <xmltag>XVal</xmltag>
>     <parent>Ext</parent>
>   </tagmap>
>
>   <tagmap field="EXT_VAL2">
>     <xmltag>XVal</xmltag>
>     <parent>Ext</parent>
>   </tagmap>
>
>   <tagmap field="EXT_VAL3">
>     <xmltag>XVal</xmltag>
>     <parent>Ext</parent>
>   </tagmap>
> </datatype>
>

Lukas Zeller ([email protected])
-
Synthesis AG, SyncML Solutions & Sustainable Software Concepts
[email protected], http://www.synthesis.ch

_______________________________________________
os-libsynthesis mailing list
[email protected]
http://lists.synthesis.ch/mailman/listinfo/os-libsynthesis
