[ 
https://issues.apache.org/jira/browse/OAK-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123155#comment-16123155
 ] 

Chetan Mehrotra edited comment on OAK-6545 at 8/11/17 10:39 AM:
----------------------------------------------------------------

Implemented in 1804763-1804770.

The implementation supports the following:

* Supports exporting NodeState in json and cnd format
* Export can be done via an explicit {{export}} command or a Groovy console 
command
* Supports serializing blobs to FileDataStore storage, i.e. blobs are stored 
in a local FDS in a directory named "blobs", which is created beside 
nodestate.json
* Blob serialization can skip problematic binaries by writing a marker blobId. 
Such blobs fail on deserialization and are marked as "*ERROR*-<blob id>" in the 
serialized form
* The json is written in a streaming way, so serializing large trees is supported

*Export Command*

Refer to [Oak Run NodeStore Connection|https://jackrabbit.apache.org/oak/docs/features/oak-run-nodestore-connection-options.html]
for details on how to connect to the various NodeStores and BlobStores.

{noformat}
$ java -jar oak-run-*.jar export -p /path/in/repo /path/of/segmentstore -o /path/of/output/dir
$ java -jar oak-run-*.jar export -h
Exports NodeState as json

The export command supports exporting nodes from a repository in json. It also provide options to export the blobs
  which are stored in FileDataStore format

Option                           Description
------                           -----------
-b, --blobs [Boolean]            Export blobs also. By default blobs are not exported (default: false)
-d, --depth [Integer]            Max depth to include in output (default: 2147483647)
-f, --filter <String>            Filter expression as json to filter out which nodes and properties are included in
                                   exported file (default: {"properties":["*", "-:childOrder"]})
--filter-file <File>             Filter file which contains the filter json expression
--format <String>                Export format 'json' or 'txt' (default: json)
-n, --max-child-nodes [Integer]  Maximum number of child nodes to include for a any parent (default: 2147483647)
-o, --out <File>                 Output directory where the exported json and blobs are stored (default: .)
-p, --path <String>              Repository path to export (default: /)
--pretty [Boolean]               Pretty print the json output (default: true)
{noformat}
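
As a rough illustration of how the options in the help output above combine, the following Python sketch assembles an export command line without running it. The helper name {{build_export_command}} and the jar name {{oak-run.jar}} are hypothetical, and the {{--name=value}} form for long options is an assumption about the option parser; only the option names and the default filter expression are taken from the help text.

```python
import json
import shlex

# Default filter expression, copied from the help output above.
DEFAULT_FILTER = {"properties": ["*", "-:childOrder"]}


def build_export_command(jar, store, repo_path, out_dir,
                         filter_expr=None, depth=None):
    """Return the argv list for an oak-run export call (nothing is executed).

    Hypothetical helper: the option names come from the help text, the
    '--name=value' spelling is an assumption.
    """
    cmd = ["java", "-jar", jar, "export", "-p", repo_path, store, "-o", out_dir]
    if depth is not None:
        cmd.append(f"--depth={depth}")
    if filter_expr is not None:
        # The filter is passed as a single json string argument.
        cmd.append("--filter=" + json.dumps(filter_expr))
    return cmd


cmd = build_export_command(
    "oak-run.jar",              # placeholder jar name
    "/path/of/segmentstore",
    "/path/in/repo",
    "/path/of/output/dir",
    filter_expr=DEFAULT_FILTER,
    depth=2,
)
# shlex.join quotes the json filter so it can be pasted into a shell.
print(shlex.join(cmd))
```

Building the argument list first and quoting it with {{shlex.join}} avoids hand-escaping the json filter on the shell command line; for longer filters the {{--filter-file}} option listed above is probably the simpler route.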

*Export in Groovy Console*
{noformat}
$  java -jar oak-run-*.jar console /path/of/segmentstore
Apache Jackrabbit Oak 1.8-SNAPSHOT
Repository connected in read-only mode. Use '--read-write' for write operations
Jackrabbit Oak Shell (Apache Jackrabbit Oak 1.8-SNAPSHOT, JVM: 1.8.0_66)
Type ':help' or ':h' for help.
----------------------------------------------------------------------------------------------------------------------------
/> cd /var/reports
/var/reports> export -c
{
 "jcr:primaryType": "nam:sling:Folder",
 "jcr:mixinTypes": [
  "nam:rep:AccessControllable"
 ],
 "jcr:createdBy": "admin",
 "jcr:created": "dat:2017-01-26T08:02:24.122+05:30",
 "rep:policy": {
  "jcr:primaryType": "nam:rep:ACL",
  "allow": {
   "jcr:primaryType": "nam:rep:GrantACE",
   "rep:principalName": "snapshotservice",
   "rep:privileges": [
    "nam:jcr:read",
    "nam:rep:write"
   ]
  }
 }
}
/var/reports> export -h
usage: export-nodes [-h] [-p <repo_path_to_export>] [-o <dir_name>]
Export nodes and its children as json
 -b,--blobs                   Serialize blob contents also
 -c,--console                 Output to console
 -d,--depth <arg>             Maximum tree depth to write out. Default to
                              all
 -f,--filter <arg>            Filter for nodes and properties to include
                              in json format. Default {"properties":["*",
                              "-:childOrder"]}
 -h,--help                    Print usage
 -n,--max-child-nodes <arg>   maximum number of child nodes to include
 -o,--out <out>               Directory name to store json and blobs
                              (default: .)
 -p,--path <path>             Repository path to export (default: current
                              node)

{noformat}
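
The json shown by {{export -c}} above can be post-processed with a few lines of script. The Python sketch below walks such an export and decodes the two value prefixes visible in the sample ({{nam:}} for names and {{dat:}} for dates). It assumes that a nested json object is a child node and that everything else is a property; other prefixes the exporter may emit are left untouched. The function names are illustrative only.

```python
import json
from datetime import datetime

# Sample copied from the console output above; a real export would be
# read from the json file written to the output directory.
SAMPLE = """
{
 "jcr:primaryType": "nam:sling:Folder",
 "jcr:mixinTypes": ["nam:rep:AccessControllable"],
 "jcr:createdBy": "admin",
 "jcr:created": "dat:2017-01-26T08:02:24.122+05:30",
 "rep:policy": {
  "jcr:primaryType": "nam:rep:ACL",
  "allow": {
   "jcr:primaryType": "nam:rep:GrantACE",
   "rep:principalName": "snapshotservice",
   "rep:privileges": ["nam:jcr:read", "nam:rep:write"]
  }
 }
}
"""


def decode_value(value):
    """Decode the 'nam:' and 'dat:' prefixes seen in the sample output."""
    if isinstance(value, list):
        return [decode_value(v) for v in value]
    if isinstance(value, str):
        if value.startswith("nam:"):
            return value[len("nam:"):]
        if value.startswith("dat:"):
            return datetime.fromisoformat(value[len("dat:"):])
    return value


def walk(node, path="/"):
    """Yield (path, properties) for a node and each of its descendants.

    Assumption: a json object value is a child node, anything else a property.
    """
    props = {k: decode_value(v) for k, v in node.items()
             if not isinstance(v, dict)}
    yield path, props
    for name, child in node.items():
        if isinstance(child, dict):
            yield from walk(child, path.rstrip("/") + "/" + name)


for node_path, props in walk(json.loads(SAMPLE), "/var/reports"):
    print(node_path, sorted(props))
```

Because {{walk}} is a generator, it only holds one node's properties at a time, although {{json.loads}} still reads the whole document into memory; a very large export would need a streaming parser instead.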



> Tooling to serialize NodeState as json along with blobs
> -------------------------------------------------------
>
>                 Key: OAK-6545
>                 URL: https://issues.apache.org/jira/browse/OAK-6545
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: run
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8
>
>
> For debugging certain cases like OAK-6525 we need a way to analyze the hidden 
> NodeState structure used by indexes. To simplify the effort I would like to 
> add some tooling to oak-run which allows dumping the NodeState and its 
> children for a certain path, along with the blob contents.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
