The following comment has been added to this issue:
Author: Thierry Delprat
Date: 11/05/07 04:19
Comment:
Some more ideas about the future NXIO (other names welcome).
Simple Readers :
================
Core Reader
-----------
Reads documents from the core and returns a paginated list of DocumentModels.
At first, the data extraction will be configured by giving an NXQL query.
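As a rough illustration of the paginated reading described above, here is a minimal Python sketch. The `fetch_page` callable, its `(query, offset, limit)` signature and the dict-based document models are all assumptions made for the example; they are not part of any existing API.

```python
from typing import Callable, Iterator, List

# Hypothetical signature: (nxql_query, offset, limit) -> list of document models.
FetchPage = Callable[[str, int, int], List[dict]]


def read_pages(fetch_page: FetchPage, query: str,
               page_size: int = 50) -> Iterator[List[dict]]:
    """Yield successive pages of document models until the source is exhausted."""
    offset = 0
    while True:
        page = fetch_page(query, offset, page_size)
        if not page:
            return
        yield page
        # A short page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        offset += len(page)
```

A writer can then consume the pages one at a time without loading the whole result set in memory.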
XML Reader
----------
Reads XML files and generates a list of DocumentModels.
Each DocumentModel (DM) may be very basic but needs to have at least :
- a path
- DataModels filled and named
About the XML format
--------------------
The input source will be configured via a URL : at start, only the file:// protocol will be implemented.
The source may be :
- a folder containing XML or zip files
- a zip file (zipped folder containing XML files)
Each XML file represents one document as a simple XML tree : document/schemas/schemaX/fieldY
The idea is that the document/schemas node should be valid against the schemas' XSDs.
The root node (document) will also have some special children :
- "transfert"
  - Source (label)
  - Date and hour of generation
  - BlobRepresentation
    - externalized
    - externalized with digest
    - base64 inline
  - RelativePath
  - signature/digest of the document
- "type"
  - Typename
- "facets" to store the list of facets of the document
- lifecycle
- "security" ACP XML representation
As a first implementation I guess we can start without security and lifecycle.
All data in XML will be UTF-8.
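To make the proposed layout more concrete, here is a minimal Python sketch that reads one such XML document into a basic DM (a path plus named data models). The sample file, the "dublincore" schema name, and the choice of taking the path from transfert/RelativePath are assumptions made for illustration; the format is not finalized.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; element names follow the layout described above.
SAMPLE = """<document>
  <transfert><RelativePath>folder/documentA</RelativePath></transfert>
  <type><Typename>File</Typename></type>
  <schemas>
    <dublincore><title>Document A</title></dublincore>
  </schemas>
</document>"""


def parse_document(xml_text: str) -> dict:
    """Build a minimal document model from a document/schemas/schemaX/fieldY tree."""
    root = ET.fromstring(xml_text)
    if root.tag != "document":
        raise ValueError("expected a <document> root node")
    data_models = {}
    schemas = root.find("schemas")
    if schemas is not None:
        for schema in schemas:
            # One data model per schema node, one entry per field node.
            data_models[schema.tag] = {
                field.tag: (field.text or "").strip() for field in schema
            }
    return {
        # Assumption: the path is carried by transfert/RelativePath.
        "path": root.findtext("transfert/RelativePath", default="").strip(),
        "type": root.findtext("type/Typename", default="").strip(),
        "data_models": data_models,
    }
```

This sketch does not validate against the XSDs; it only shows how the tree maps to a path and named data models.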
The blobs can be stored inline in base64.
The blobs can be externalized.
If documentA.xml contains the XML export, the externalized files will be at the same level.
The blob file names can be arbitrary : they just have to be referenced in the main XML file.
=> for example <externalizedBlob> file://documentA_fieldY.xxx </externalizedBlob>
In the medium term, it will also be useful to store a signature/digest of the externalized blob inside the XML file.
=> at first a simple SHA1 would be good
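The two blob representations and the SHA1 digest could look roughly like the Python sketch below. The function names, the returned dict keys and the "bin" default extension are hypothetical; only the base64 inlining, the file:// reference next to the document and the SHA1 digest come from the notes above.

```python
import base64
import hashlib


def inline_blob(data: bytes) -> str:
    """Return the blob content encoded as base64, for inline storage in the XML."""
    return base64.b64encode(data).decode("ascii")


def externalized_blob(doc_name: str, field: str, data: bytes,
                      extension: str = "bin") -> dict:
    """Describe a blob stored next to the document XML, with its SHA1 digest."""
    file_name = f"{doc_name}_{field}.{extension}"
    return {
        "file_name": file_name,
        "reference": f"file://{file_name}",
        "sha1": hashlib.sha1(data).hexdigest(),
    }


def verify_blob(data: bytes, expected_sha1: str) -> bool:
    """Check an externalized blob against the digest recorded in the XML."""
    return hashlib.sha1(data).hexdigest() == expected_sha1
```

On import, the reader would recompute the digest of each externalized file and reject the document if `verify_blob` fails.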
Simple Writer
=============
Core Writer
-----------
Reads a paginated list of DMs and creates them in the repository.
Configuration includes :
- target core/domain
- base path
- base ACL
- default folder type
The default folder type will be useful because in some systems folders are not documents : since the XML export format only exports documents with a path, some folderish nodes may be missing.
(Needed for processing Lotus Notes XML exports)
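One possible way for the writer to fill in the missing folderish nodes is sketched below in Python. It assumes absolute, slash-separated paths and dict-based DMs; the "Folder" default and the function name are placeholders, not decided values.

```python
from typing import Iterable, List


def with_missing_folders(docs: Iterable[dict],
                         default_folder_type: str = "Folder") -> List[dict]:
    """Return the documents in creation order, adding any missing parent folders."""
    docs = list(docs)
    known = {doc["path"] for doc in docs}
    created = set()
    result = []
    # Sorting by path puts every parent before its children.
    for doc in sorted(docs, key=lambda d: d["path"]):
        parts = doc["path"].strip("/").split("/")
        for depth in range(1, len(parts)):
            parent = "/" + "/".join(parts[:depth])
            if parent not in known and parent not in created:
                created.add(parent)
                result.append({"path": parent, "type": default_folder_type})
        result.append(doc)
    return result
```

For example, importing only /notes and /notes/2007/memo would first create /notes, then a default-typed /notes/2007, then the memo itself.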
XML Writer
----------
Writes the XML representation to the configured output URL (for now file only).
Configuration :
- target path
- externalize Blob true/false
Simple converter
================
The converter may be used for :
- type mapping
  Defines the target type when reading from a file system.
  => at start we will just define a default type
  ==> all incoming nodes with a type will get the default one
  => and also a simple type bijection (Benjamin needs something like that)
- RelativePath mapping
  May be used to define the path of the document to import when reading from a file system.
  => at start just enable/disable the addition of the source relative path as a prefix to the target path.
- security
  Configurable ACL based on the metadata and some rules
  => at start nothing
- filtering
  Hides some data when reading from the core
  => at start a list of schemas that are not exported
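The initial converter features (default type, type bijection, relative path prefix, schema filtering) could be combined along the lines of the Python sketch below. All names, parameters and the dict-based DM are hypothetical, and falling back to the default type when no mapping exists is an assumption about how the two type rules interact.

```python
import posixpath


def invert_type_map(type_map: dict) -> dict:
    """Reverse a type mapping; only possible when it is a bijection."""
    if len(set(type_map.values())) != len(type_map):
        raise ValueError("type mapping is not a bijection")
    return {target: source for source, target in type_map.items()}


def convert(doc: dict, *, default_type: str, type_map: dict = None,
            base_path: str = "/", add_source_prefix: bool = False,
            excluded_schemas: tuple = ()) -> dict:
    """Apply type mapping, path mapping and schema filtering to one document."""
    type_map = type_map or {}
    parts = [base_path]
    # Optionally keep the source relative path as a prefix of the target path.
    if add_source_prefix and doc.get("relative_path"):
        parts.append(doc["relative_path"].strip("/"))
    parts.append(doc["name"])
    return {
        "path": posixpath.join(*parts),
        # Assumption: types without a mapping fall back to the default type.
        "type": type_map.get(doc.get("type"), default_type),
        "data_models": {
            name: fields
            for name, fields in doc.get("data_models", {}).items()
            if name not in excluded_schemas
        },
    }
```

Because the mapping is a bijection, `invert_type_map` gives the mapping to use in the opposite direction (export instead of import).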
ETL
===
The purpose is not to replace an ETL.
But some very "content oriented" mappings will be difficult to do with an ETL :
- type mapping
- rights mapping
- ...
In the medium term, an ETL like Talend could be plugged into the Nuxeo import/export pipe :
- via the XML file input/output
- via a set of connectors to Talend that provide readers and writers for NXIO.
Implement an import/export feature that can export a list of documents (of any type) along with repository-related information (e.g. security, versioning).
The export format should be XML-based and reuse the XML Schema definitions of the content types.
_______________________________________________
ECM-tickets mailing list
[email protected]
http://lists.nuxeo.com/mailman/listinfo/ecm-tickets