[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-02-16 Thread perry he (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149904#comment-15149904
 ] 

perry he commented on AXIOM-478:


Hi Andreas, please kick off the 1.2.18 release if there are no further 
concerns. Thanks! We will pick it up in our formal build.

> Solution for parsing large XML
> --
>
> Key: AXIOM-478
> URL: https://issues.apache.org/jira/browse/AXIOM-478
> Project: Axiom
>  Issue Type: Question
>Reporter: LU Jie
>
> This is LU Jie from IBM. We use Axiom to parse Atom in our project. 
> One of our CMIS APIs attaches file content to the XML. If the file is 
> large, we get a large Atom document.
> If we use Entry.getExtension(QName) to parse the content, it allocates a 
> large amount of memory (around 5-6 times the file size).
> We need your help to clarify whether we can use the DOM-like API of Axiom to 
> get the text of a certain element as a stream, that is, without allocating a 
> large object in memory.
> Or is there an alternative solution for this use case?
> We DO know that we can use a pull parser to parse the XML as a stream. But we 
> need help to investigate whether Axiom already provides an API or solution 
> so that we can avoid writing a parser ourselves.
> Here's the sample XML. We need to parse the text of cmisra:base64 element:
> {noformat}
>  xmlns:atom="http://www.w3.org/2005/Atom;
> xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/;
> xmlns:chemistry="http://chemistry.apache.org/;
> xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/;>
>  
> xmlns:atom="http://www.w3.org/2005/Atom;>urn:uuid:----000
> 
>  xmlns:atom="http://www.w3.org/2005/Atom; 
> type="text">doucment1446016556658.txt
> 
>  xmlns:atom="http://www.w3.org/2005/Atom;>2015-10-28T07:15:57.594Z
> 
>  xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/;>
>  
> xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/;>text/plain
> 
>  
> xmlns:chemistry="http://chemistry.apache.org/;>doucment1446016556658.txt
> 
>  
> xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/;>Base64 
> encoded content of large file
> 
> 
>  xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/;>
>  xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/;>
>  xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/; 
> propertyDefinitionId="cmis:objectTypeId">
>  
> xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/;>snx:file
> 
> 
>  xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/; 
> propertyDefinitionId="cmis:name">
>  
> xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/;>doucment1446016556658.txt
> 
> 
> 
> 
> 
> {noformat}
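
For the use case in the quoted description (a large Base64 payload inside one element), the decoding step can also be done incrementally once the element text is available as a java.io.Reader, such as the one returned by getTextAsStream discussed elsewhere in this thread. The following is a minimal sketch using only the JDK; the class and method names (Base64ReaderDecoder, decode) are hypothetical and not part of Axiom, and it assumes the payload is standard Base64, possibly with line breaks.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.util.Base64;

// Hypothetical helper, not part of Axiom: decodes Base64 text from a Reader
// in small chunks so that the whole payload is never held in memory.
class Base64ReaderDecoder {

    /** Copies the decoded bytes to out and returns the number of bytes written. */
    public static long decode(Reader in, OutputStream out) throws IOException {
        Base64.Decoder decoder = Base64.getDecoder();
        char[] buf = new char[4096];
        // Holds characters that do not yet form a complete group of four.
        StringBuilder pending = new StringBuilder();
        long written = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            for (int i = 0; i < n; i++) {
                // The basic decoder rejects whitespace, so drop line breaks and spaces.
                if (!Character.isWhitespace(buf[i])) {
                    pending.append(buf[i]);
                }
            }
            // Base64 encodes 3 bytes as 4 characters: decode only complete groups.
            int usable = pending.length() - pending.length() % 4;
            if (usable > 0) {
                byte[] bytes = decoder.decode(pending.substring(0, usable));
                out.write(bytes);
                written += bytes.length;
                pending.delete(0, usable);
            }
        }
        if (pending.length() > 0) {
            // Unpadded tail (2 or 3 characters); a single leftover character is invalid
            // and makes the decoder throw IllegalArgumentException.
            byte[] bytes = decoder.decode(pending.toString());
            out.write(bytes);
            written += bytes.length;
        }
        return written;
    }
}
```

With this shape, memory use is bounded by the chunk size rather than by the file size, provided the Reader itself does not buffer the whole element text.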



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@ws.apache.org
For additional commands, e-mail: dev-h...@ws.apache.org



[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-02-16 Thread LU Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149702#comment-15149702
 ] 

LU Jie commented on AXIOM-478:
--

Hi Andreas, 1.2.18-SNAPSHOT does fix the second issue I reported on 19/Jan. 




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-02-16 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149234#comment-15149234
 ] 

Andreas Veithen commented on AXIOM-478:
---

> LU Jie added a comment on 13/Jan/16 to confirm your previous fix works, 
> thanks!

No, her comment on 13/Jan/16 was in response to advice I gave that works with 
the existing 1.2.17 release. I was requesting feedback on the issue she 
described on 19/Jan/16, which is fixed in 1.2.18-SNAPSHOT.

> But seems there is another issue(AXIOM-288) in this thread on 19/Jan/16, I am 
> not sure about this. If you are ok with AXIOM-288, I think we are ok to go 
> for a new release.

No, AXIOM-288 was already implemented in 1.2.15. The problem that will be fixed 
by the upcoming 1.2.18 release is that the getTextAsStream method didn't use 
the enhancement introduced by AXIOM-288 in the correct way.




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-02-16 Thread perry he (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148376#comment-15148376
 ] 

perry he commented on AXIOM-478:


LU Jie added a comment on 13/Jan/16 to confirm that your previous fix works, 
thanks!

But it seems there is another issue (AXIOM-288) raised in this thread on 
19/Jan/16, and I am not sure about it. If you are OK with AXIOM-288, I think we 
are fine to go ahead with a new release.




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-02-16 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148352#comment-15148352
 ] 

Andreas Veithen commented on AXIOM-478:
---

If you can confirm that 1.2.18-SNAPSHOT fixes the problem and that this issue 
can be closed, then I can kick off a release.




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-02-15 Thread perry he (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148224#comment-15148224
 ] 

perry he commented on AXIOM-478:


Hi Andreas, do you know when 1.2.18 will be released? LU Jie reported this 
issue and we would like to pick up the fix in a formal build, thanks!




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-02-13 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146205#comment-15146205
 ] 

Andreas Veithen commented on AXIOM-478:
---

Ping. Did you test with 1.2.18-SNAPSHOT?




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-19 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107428#comment-15107428
 ] 

Andreas Veithen commented on AXIOM-478:
---

There is indeed a bug. The problem is that the code that implements 
getTextAsStream() wasn't updated to leverage the improvement implemented in 
AXIOM-288. I'm working on a fix now.




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-19 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107557#comment-15107557
 ] 

Andreas Veithen commented on AXIOM-478:
---

A snapshot version (1.2.18-SNAPSHOT) containing the fix is available. You can 
pull it from the Maven snapshot repository at 
https://repository.apache.org/content/repositories/snapshots/ or download the 
binary distribution at 
https://builds.apache.org/job/axiom-trunk/lastStableBuild/org.apache.ws.commons.axiom$distribution/.
Note that for this to work, it is mandatory to close the Reader returned by 
getTextAsStream before accessing other nodes in the tree.

Please let me know if this solves the issue.
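
The requirement to close the Reader can be illustrated with a small model built on the JDK's StAX API. This is only a sketch of the general idea, not Axiom's implementation: ElementTextReader and readPayloadThenName are hypothetical names, the element names base64 and name are simplified (no namespaces), and the element is assumed to contain text only. The Reader hands out the element text chunk by chunk, and close() skips whatever was not read so that the parser ends up positioned behind the element and later nodes can be reached.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical model, not Axiom code.
class ElementTextStreamDemo {

    /** Streams the text of the element the parser is currently positioned on. */
    static final class ElementTextReader extends Reader {
        private final XMLStreamReader parser;
        private char[] chunk = new char[0];
        private int pos;
        private int end;
        private boolean finished;

        ElementTextReader(XMLStreamReader parser) {
            this.parser = parser; // must be on the START_ELEMENT event
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            if (len == 0) {
                return 0;
            }
            try {
                // Pull events until there is text to hand out or the element ends.
                while (pos == end) {
                    if (finished) {
                        return -1;
                    }
                    int event = parser.next();
                    if (event == XMLStreamConstants.END_ELEMENT) {
                        finished = true;
                    } else if (event == XMLStreamConstants.CHARACTERS
                            || event == XMLStreamConstants.CDATA
                            || event == XMLStreamConstants.SPACE) {
                        chunk = parser.getTextCharacters();
                        pos = parser.getTextStart();
                        end = pos + parser.getTextLength();
                    } else if (event == XMLStreamConstants.START_ELEMENT) {
                        throw new IOException("Element has child elements");
                    }
                    // Comments and processing instructions are skipped.
                }
            } catch (XMLStreamException e) {
                throw new IOException(e);
            }
            int n = Math.min(len, end - pos);
            System.arraycopy(chunk, pos, buf, off, n);
            pos += n;
            return n;
        }

        @Override
        public void close() throws IOException {
            // Discard unread text so the parser stops on the element's END_ELEMENT.
            char[] discard = new char[1024];
            while (read(discard, 0, discard.length) != -1) {
                // nothing to do
            }
        }
    }

    /** Reads the first four characters of <base64>, closes the Reader, then reads <name>. */
    public static String[] readPayloadThenName(String xml) throws Exception {
        XMLStreamReader parser =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        StringBuilder head = new StringBuilder();
        String name = null;
        while (parser.hasNext()) {
            if (parser.next() != XMLStreamConstants.START_ELEMENT) {
                continue;
            }
            if (parser.getLocalName().equals("base64")) {
                try (Reader text = new ElementTextReader(parser)) {
                    char[] buf = new char[4];
                    int filled = 0;
                    int n;
                    while (filled < buf.length
                            && (n = text.read(buf, filled, buf.length - filled)) != -1) {
                        filled += n;
                    }
                    head.append(buf, 0, filled);
                } // close() runs here, before any other node is touched
            } else if (parser.getLocalName().equals("name")) {
                name = parser.getElementText();
            }
        }
        return new String[] { head.toString(), name };
    }
}
```

In this model, skipping close() would leave the parser in the middle of the payload, and the following getElementText() call would no longer see the expected element. That is the same reason the Reader must be closed before other nodes in the tree are accessed.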




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107485#comment-15107485
 ] 

Hudson commented on AXIOM-478:
--

SUCCESS: Integrated in axiom-trunk #2426 (See 
[https://builds.apache.org/job/axiom-trunk/2426/])
AXIOM-478:
* Allow OMElement#getTextAsStream to use the feature implemented in AXIOM-288 
so that caching will be reenabled after the stream is closed.
* Update the Javadoc to document the requirement to close the Reader.
* Throw a meaningful exception when the next() method is invoked on a builder 
with caching disabled. (veithen: rev 1725612)
* axiom/aspects/om-aspects/src/main/java/org/apache/axiom/om/impl/common/AxiomElementSupport.aj
* axiom/axiom-api/src/main/java/org/apache/axiom/om/OMElement.java
* axiom/axiom-api/src/main/java/org/apache/axiom/om/impl/builder/StAXOMBuilder.java
* axiom/testing/axiom-testsuite/src/main/java/org/apache/axiom/ts/om/element/TestGetTextAsStreamWithoutCaching.java





[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-12 Thread LU Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095463#comment-15095463
 ] 

LU Jie commented on AXIOM-478:
--

Hi Andreas, it worked! Thank you so much for the help!




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-08 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089009#comment-15089009
 ] 

Andreas Veithen commented on AXIOM-478:
---

You need to configure the underlying parser in non-coalescing mode, i.e. you 
need to use one of the createOMBuilder variants that take a 
StAXParserConfiguration argument and pass 
StAXParserConfiguration.NON_COALESCING to it. Note that this is known to work 
with Woodstox and the StAX implementation in the JRE, but not with others (see 
http://veithen.github.io/2013/10/11/broken-by-design-xlxp2.html).
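
To illustrate what non-coalescing mode changes, here is a minimal sketch that 
uses only the StAX API shipped with the JRE, not Axiom itself. The names 
NonCoalescingDemo, sampleDocument and countTextEvents are made up for this 
example. It parses one large text node twice and reports how many CHARACTERS 
events the parser delivers in each mode.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class NonCoalescingDemo {
    /** Builds a document with a single element holding textLength characters. */
    static String sampleDocument(int textLength) {
        StringBuilder sb = new StringBuilder("<base64>");
        for (int i = 0; i < textLength; i++) {
            sb.append('A');
        }
        return sb.append("</base64>").toString();
    }

    /** Returns { number of CHARACTERS events, total characters reported }. */
    static long[] countTextEvents(String xml, boolean coalescing) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Standard StAX property: true merges adjacent character data
            // into one event, false lets the parser hand it out in pieces.
            factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            long events = 0;
            long chars = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    events++;
                    chars += reader.getTextLength();
                }
            }
            reader.close();
            return new long[] { events, chars };
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = sampleDocument(200000);
        long[] merged = countTextEvents(xml, true);
        long[] split = countTextEvents(xml, false);
        System.out.println("coalescing:     " + merged[0] + " event(s), "
                + merged[1] + " chars");
        System.out.println("non-coalescing: " + split[0] + " event(s), "
                + split[1] + " chars");
    }
}
```

In coalescing mode the whole text has to be assembled before the single 
event is reported, which is what makes large base64 payloads expensive. How 
many pieces a non-coalescing parser produces depends on the StAX 
implementation. In Axiom the corresponding switch is the 
StAXParserConfiguration.NON_COALESCING argument described above.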




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-06 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085270#comment-15085270
 ] 

Andreas Veithen commented on AXIOM-478:
---

OMElement has a getTextAsStream method that does what you are looking for. 
Please read the Javadoc carefully to understand how to use it in such a way as 
to avoid building the entire base64 encoded content in memory.




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-06 Thread LU Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085183#comment-15085183
 ] 

LU Jie commented on AXIOM-478:
--

Hi Andreas,

Answers:
1. Currently we use abdera 0.4.0 with axiom 1.2.5. 
2. We don't use IBM XLXP. We use the StAX implementation that comes with the 
abdera 0.4.0 project.

More details:
I tested the axiom API with test code. I tried to parse an atom with 40MB of 
content. It allocated around 260MB of objects in memory. To fix this issue, I 
would have to implement a parser for the CMIS atom with a pull parser, as the 
axiom user guide mentions. 

Now we are updating our dependencies to abdera 1.1.3 with axiom 1.2.16. Our 
plan is to parse the base64 content and decode it as a stream, then serialize 
it to a file in storage to avoid loading the whole file into memory. I DO know 
I can use XMLStreamReaderUtils.getElementTextAsStream. But we prefer to use the 
axiom API instead of a pull parser to avoid writing and maintaining a parser 
ourselves. So I need help to clarify whether the latest axiom element interface 
has this capability. Or could you point me to an alternative solution, if 
there is one?
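
The decoding half of this plan (read base64 text from a stream, decode it, 
write the bytes out) can be sketched with the JDK alone. This is an 
illustration only, not Axiom or Abdera code: Base64StreamCopy, decode and 
decodeAll are hypothetical names, and the Reader is assumed to be whatever 
supplies the element text, for example the one returned by 
XMLStreamReaderUtils.getElementTextAsStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

class Base64StreamCopy {
    private static final int CHUNK = 8192;

    /**
     * Decodes base64 text from in and writes the bytes to out. Only a few
     * kilobytes are held in memory at a time. Returns the decoded byte count.
     */
    static long decode(Reader in, OutputStream out) throws IOException {
        char[] chars = new char[CHUNK];
        // Holds base64 characters not yet decoded: at most 3 left over from
        // the previous round plus one full chunk.
        byte[] pending = new byte[CHUNK + 3];
        int pendingLen = 0;
        long total = 0;
        int n;
        while ((n = in.read(chars)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = chars[i];
                if (Character.isWhitespace(c)) {
                    continue; // line breaks inside the payload are ignored
                }
                if (c > 127) {
                    throw new IOException("Not a base64 character: " + c);
                }
                pending[pendingLen++] = (byte) c;
            }
            // Base64 decodes in groups of four characters.
            int usable = pendingLen - (pendingLen % 4);
            if (usable > 0) {
                byte[] decoded =
                        Base64.getDecoder().decode(Arrays.copyOf(pending, usable));
                out.write(decoded);
                total += decoded.length;
                System.arraycopy(pending, usable, pending, 0, pendingLen - usable);
                pendingLen -= usable;
            }
        }
        if (pendingLen != 0) {
            throw new IOException("Truncated base64 input");
        }
        return total;
    }

    /** Convenience wrapper for small inputs, e.g. in tests. */
    static byte[] decodeAll(String base64) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            decode(new StringReader(base64), out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

To write to storage, pass a FileOutputStream (ideally wrapped in a 
BufferedOutputStream) as the out argument. Memory use then stays bounded 
only if the Reader itself does not buffer the whole text.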

Thank you in advance for the help!




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-06 Thread LU Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086795#comment-15086795
 ] 

LU Jie commented on AXIOM-478:
--

I tested getTextAsStream(), but it still consumes a large amount of memory. 
Here's the test code:

InputStream is = getClass().getResourceAsStream("sample2.txt");
OMXMLParserWrapper builder = OMXMLBuilderFactory.createOMBuilder(is);
OMElement document = builder.getDocumentElement();

Iterator entryIt = document.getChildElements();
Iterator contentIt = null;
OMElement contentElement = null;
OMElement base64Element = null;
Reader reader = null;

// Look for the content element among the children of the root element.
while (entryIt.hasNext()) {
    contentElement = (OMElement) entryIt.next();
    if ("content".equals(contentElement.getLocalName())) {
        contentIt = contentElement.getChildElements();
        while (contentIt.hasNext()) {
            base64Element = (OMElement) contentIt.next();
            if ("base64".equals(base64Element.getLocalName())) {
                // cache=false: do not keep the text in the Axiom tree.
                reader = base64Element.getTextAsStream(false);
                int byteCount = 0;
                for (int character = reader.read(); character != -1;
                        character = reader.read()) {
                    byteCount++;
                    if (byteCount == 1) {
                        System.out.println("breakpoint");
                    }
                    if (byteCount == 24553800) {
                        System.out.println("breakpoint");
                    }
                }
                System.out.println("Base 64 Encoded Bytes: " + byteCount);
            }
        }
    }
}


[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-06 Thread LU Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086796#comment-15086796
 ] 

LU Jie commented on AXIOM-478:
--

It allocates a large amount of memory after this line:
reader = base64Element.getTextAsStream(false);




[jira] [Commented] (AXIOM-478) Solution for parsing large XML

2016-01-05 Thread Andreas Veithen (JIRA)

[ 
https://issues.apache.org/jira/browse/AXIOM-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082606#comment-15082606
 ] 

Andreas Veithen commented on AXIOM-478:
---

Some questions:

* Are the extension elements for CMIS implemented using custom code, or are you 
using an existing library?
* What are your constraints regarding the StAX implementation? In particular, 
does the solution have to work with IBM's XLXP parser?
