[jira] [Commented] (XERCESC-2241) Integer overflows in DFAContentModel class

2022-10-02 Thread Even Rouault (Jira)


[ 
https://issues.apache.org/jira/browse/XERCESC-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17612153#comment-17612153
 ] 

Even Rouault commented on XERCESC-2241:
---

Fix in https://github.com/apache/xerces-c/pull/51

> Integer overflows in DFAContentModel class
> --
>
> Key: XERCESC-2241
> URL: https://issues.apache.org/jira/browse/XERCESC-2241
> Project: Xerces-C++
>  Issue Type: Bug
>  Components: Validating Parser (XML Schema)
>Reporter: Even Rouault
>Priority: Major
>
> On .xsd files like the following ones (generated by ossfuzz, so broken), 
> integer overflows can happen in DFAContentModel::countLeafNodes() and 
> DFAContentModel::buildDFA() which can later cause out-of-bounds access.
> Found in [https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=52025]
>  
> ```
> http://www.w3.org/2001/XMLSchema;
>            xmlns:myns="http://myns;
>            targetNamespace="http://myns;
>            elementFormDefault="qualified" attributeFormDefault="unqualified">
> 
>   
>      
>         
>       
>   
> 
> 
>   
>       
>       
>         
>             
>  ame="x" type="xs:int" maxOccurs="1"/>
>             
>         
>       
>   
> 
> 
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: c-dev-h...@xerces.apache.org



[jira] [Created] (XERCESC-2241) Integer overflows in DFAContentModel class

2022-10-02 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2241:
-

 Summary: Integer overflows in DFAContentModel class
 Key: XERCESC-2241
 URL: https://issues.apache.org/jira/browse/XERCESC-2241
 Project: Xerces-C++
  Issue Type: Bug
  Components: Validating Parser (XML Schema)
Reporter: Even Rouault


On .xsd files like the following ones (generated by ossfuzz, so broken), 
integer overflows can happen in DFAContentModel::countLeafNodes() and 
DFAContentModel::buildDFA() which can later cause out-of-bounds access.

Found in [https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=52025]

 

```

http://www.w3.org/2001/XMLSchema;
           xmlns:myns="http://myns;
           targetNamespace="http://myns;
           elementFormDefault="qualified" attributeFormDefault="unqualified">


  
     
        
      
  



  
      
      
        
            
 ame="x" type="xs:int" maxOccurs="1"/>
            
        
      
  




```
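
One way to guard against the kind of overflow described above is to do the counting with checked arithmetic. The following is only a sketch, assuming the counts are held in unsigned 32-bit integers; the helper names `checkedAdd` and `checkedMul` are hypothetical and are not taken from the Xerces-C++ sources or from the fix itself. Whether a multiplication actually occurs in the affected code is also an assumption.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: add two counters, throwing instead of wrapping around.
inline std::uint32_t checkedAdd(std::uint32_t a, std::uint32_t b)
{
    if (a > std::numeric_limits<std::uint32_t>::max() - b)
        throw std::overflow_error("leaf count overflow (addition)");
    return a + b;
}

// Hypothetical helper: multiply two counters (for instance a subtree's leaf
// count by a repetition count), throwing instead of wrapping around.
inline std::uint32_t checkedMul(std::uint32_t a, std::uint32_t b)
{
    if (a != 0 && b > std::numeric_limits<std::uint32_t>::max() / a)
        throw std::overflow_error("leaf count overflow (multiplication)");
    return a * b;
}
```

A caller would catch the exception and reject the schema, rather than go on to size arrays from a wrapped-around count.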






[jira] [Commented] (XERCESC-2188) Use-after-free on external DTD scan

2022-01-23 Thread Even Rouault (Jira)


[ 
https://issues.apache.org/jira/browse/XERCESC-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480656#comment-17480656
 ] 

Even Rouault commented on XERCESC-2188:
---

My attempt at fixing the issue in https://github.com/apache/xerces-c/pull/47

> Use-after-free on external DTD scan
> ---
>
> Key: XERCESC-2188
> URL: https://issues.apache.org/jira/browse/XERCESC-2188
> Project: Xerces-C++
>  Issue Type: Bug
>  Components: Validating Parser (DTD)
>Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.1.3, 
> 3.1.4, 3.2.1, 3.2.2
>Reporter: Scott Cantor
>Priority: Major
> Attachments: Apache-496067-disclosure-report.pdf
>
>
> This is a record of an unfixed bug reported in 2018 in the DTD scanner, per 
> the attached PDF, corresponding to CVE-2018-1311.






[jira] [Created] (XERCESC-2235) DFAContentModel::buildDFA(): correctly zero-initialize fFollowList

2021-12-20 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2235:
-

 Summary: DFAContentModel::buildDFA(): correctly zero-initialize 
fFollowList
 Key: XERCESC-2235
 URL: https://issues.apache.org/jira/browse/XERCESC-2235
 Project: Xerces-C++
  Issue Type: Bug
Reporter: Even Rouault


Due to a copy issue, the intended zero-initialization of fFollowList was not
performed. As a result, if an OutOfMemory exception was thrown while
initializing the array, the memory freeing in cleanup() could access
uninitialized elements.

Follow-up of https://github.com/apache/xerces-c/pull/40 / 
a65990d79d3fc333d7481f010da4e165a88b6cb3

Fixes GDAL's https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=42636
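
The failure mode described above can be illustrated with a small standalone sketch. It does not reproduce the Xerces-C++ code: `StateSet`, `FollowListOwner` and the `failAt` parameter are hypothetical names used only to simulate an allocation failure partway through filling an array of owned pointers.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for an element type owned through raw pointers.
struct StateSet { int dummy; };

// Hypothetical owner of an array of owned pointers.
struct FollowListOwner
{
    StateSet** list = nullptr;
    std::size_t count = 0;

    // Allocate the pointer array and zero-initialize every slot before any
    // element is allocated. failAt simulates an out-of-memory condition
    // when the element at that index is about to be created.
    void build(std::size_t n, std::size_t failAt)
    {
        list = new StateSet*[n]();   // trailing () makes every slot nullptr
        count = n;
        for (std::size_t i = 0; i < n; ++i)
        {
            if (i == failAt)
                throw std::bad_alloc();   // simulated allocation failure
            list[i] = new StateSet{0};
        }
    }

    // Safe after a partial build: unfilled slots are nullptr, and deleting
    // a null pointer is a no-op.
    void cleanup()
    {
        for (std::size_t i = 0; i < count; ++i)
            delete list[i];
        delete[] list;
        list = nullptr;
        count = 0;
    }
};
```

Without the trailing `()` on the array allocation, the slots past the failing index would hold indeterminate values and `cleanup()` would pass them to `delete`.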






[jira] [Created] (XERCESC-2233) DFAContentModel::buildDFA(): fix memory leaks when OutOfMemoryException occurs

2021-12-04 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2233:
-

 Summary: DFAContentModel::buildDFA(): fix memory leaks when 
OutOfMemoryException occurs
 Key: XERCESC-2233
 URL: https://issues.apache.org/jira/browse/XERCESC-2233
 Project: Xerces-C++
  Issue Type: Bug
Reporter: Even Rouault


Fixes GDAL's [https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=41335]






[jira] [Created] (XERCESC-2230) DFAContentModel::buildSyntaxTree(): fix memory leaks when OutOfMemoryException occurs

2021-11-15 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2230:
-

 Summary: DFAContentModel::buildSyntaxTree(): fix memory leaks when 
OutOfMemoryException occurs
 Key: XERCESC-2230
 URL: https://issues.apache.org/jira/browse/XERCESC-2230
 Project: Xerces-C++
  Issue Type: Bug
Reporter: Even Rouault


Fixes GDAL's [https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40866]






[jira] [Created] (XERCESC-2229) IGXMLScanner::scanDocTypeDecl(): fix memory leak on exception

2021-10-28 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2229:
-

 Summary: IGXMLScanner::scanDocTypeDecl(): fix memory leak on 
exception
 Key: XERCESC-2229
 URL: https://issues.apache.org/jira/browse/XERCESC-2229
 Project: Xerces-C++
  Issue Type: Improvement
Reporter: Even Rouault


The method can leak pubId and sysId when a subsequent call to
fReaderMgr.skipPastSpaces() throws an exception (e.g. a
TranscodingException).
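
The usual remedy for this pattern is to hand the buffers to an owning object before making any call that can throw. Xerces-C++ has its own janitor classes for this (xercesc/util/Janitor.hpp); the sketch below uses `std::unique_ptr` instead so that it stays self-contained. `skipSpaces`, `dupString` and `scanIds` are hypothetical stand-ins, not the real scanner code.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a call that may throw (e.g. on a transcoding error).
inline void skipSpaces(bool fail)
{
    if (fail)
        throw std::runtime_error("simulated transcoding failure");
}

// Hypothetical helper: duplicate a C string into an owning smart pointer.
inline std::unique_ptr<char[]> dupString(const char* s)
{
    const std::size_t len = std::strlen(s) + 1;
    std::unique_ptr<char[]> copy(new char[len]);
    std::memcpy(copy.get(), s, len);
    return copy;
}

// Returns the combined length of both identifiers. Both buffers are released
// whether the function returns normally or skipSpaces() throws.
inline std::size_t scanIds(const char* pub, const char* sys, bool fail)
{
    std::unique_ptr<char[]> pubId = dupString(pub);
    std::unique_ptr<char[]> sysId = dupString(sys);
    skipSpaces(fail);
    return std::strlen(pubId.get()) + std::strlen(sysId.get());
}
```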






[jira] [Created] (XERCESC-2228) DFAContentModel: fix memory leaks when OutOfMemoryException occurs

2021-09-23 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2228:
-

 Summary: DFAContentModel: fix memory leaks when 
OutOfMemoryException occurs
 Key: XERCESC-2228
 URL: https://issues.apache.org/jira/browse/XERCESC-2228
 Project: Xerces-C++
  Issue Type: Improvement
Reporter: Even Rouault


Fixes GDAL's [https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39159]






[jira] [Created] (XERCESC-2227) Memleak fixes in ContentSpecNode and ComplexTypeInfo classes

2021-09-22 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2227:
-

 Summary: Memleak fixes in ContentSpecNode and ComplexTypeInfo 
classes
 Key: XERCESC-2227
 URL: https://issues.apache.org/jira/browse/XERCESC-2227
 Project: Xerces-C++
  Issue Type: Improvement
Reporter: Even Rouault


The memory leaks occur when an OutOfMemory exception is thrown.

Spotted by [https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39127] (on 
GDAL)

The commits are roughly in increasing order of triviality. The ownership rules
for the first and second members of ContentSpecNode, as used by ComplexTypeInfo,
are very complex. shared_ptr would be most welcome here! All I can say is that
valgrind reports neither a double-free nor a memory leak on my test case after
those fixes.






[jira] [Created] (XERCESC-2224) DFAContentModel::checkUniqueParticleAttribution (): speed enhancement

2021-09-20 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2224:
-

 Summary: DFAContentModel::checkUniqueParticleAttribution (): speed 
enhancement
 Key: XERCESC-2224
 URL: https://issues.apache.org/jira/browse/XERCESC-2224
 Project: Xerces-C++
  Issue Type: Improvement
Reporter: Even Rouault


The complexity of this method is roughly O(n^3). Fuzzers can generate
schemas with n in the several thousands. The test fTransTable[i][j] ==
XMLContentModel::gInvalidTrans is independent of the k loop, and can thus be
moved to an upper level to improve runtime.
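
The hoisting can be sketched as follows. This is not the Xerces-C++ method itself: the triple-loop shape (states i, element pairs j < k), the constant `kInvalidTrans` standing in for XMLContentModel::gInvalidTrans, and the pair counting used as the loop body are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sentinel standing in for XMLContentModel::gInvalidTrans.
constexpr unsigned int kInvalidTrans = 0xFFFFFFFFu;

using TransTable = std::vector<std::vector<unsigned int>>;

// Before: the test on table[i][j] is re-evaluated for every value of k.
inline std::size_t countPairsNaive(const TransTable& table, std::size_t elemCount)
{
    std::size_t pairs = 0;
    for (std::size_t i = 0; i < table.size(); ++i)
        for (std::size_t j = 0; j < elemCount; ++j)
            for (std::size_t k = j + 1; k < elemCount; ++k)
                if (table[i][j] != kInvalidTrans && table[i][k] != kInvalidTrans)
                    ++pairs;
    return pairs;
}

// After: the j-only test is hoisted, so an invalid transition skips the
// whole k loop instead of being checked on each of its iterations.
inline std::size_t countPairsHoisted(const TransTable& table, std::size_t elemCount)
{
    std::size_t pairs = 0;
    for (std::size_t i = 0; i < table.size(); ++i)
    {
        for (std::size_t j = 0; j < elemCount; ++j)
        {
            if (table[i][j] == kInvalidTrans)
                continue;
            for (std::size_t k = j + 1; k < elemCount; ++k)
                if (table[i][k] != kInvalidTrans)
                    ++pairs;
        }
    }
    return pairs;
}
```

The worst case remains cubic, but on sparse transition tables most (i, j) cells are invalid and the inner loop is skipped entirely.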






[jira] [Created] (XERCESC-2223) SAX2XMLReaderImpl::error(): potential memory leak

2021-09-15 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2223:
-

 Summary: SAX2XMLReaderImpl::error(): potential memory leak
 Key: XERCESC-2223
 URL: https://issues.apache.org/jira/browse/XERCESC-2223
 Project: Xerces-C++
  Issue Type: Bug
Reporter: Even Rouault


SAX2XMLReaderImpl::error() uses the regular memory manager to create the 
SAXParseException. Building the object might throw an exception before it is 
fully initialized, in which case the partially built object leaks some memory.






[jira] [Created] (XERCESC-2222) DFAContentModel::checkUniqueParticleAttribution(): fix memory leak

2021-09-11 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2222:
-

 Summary: DFAContentModel::checkUniqueParticleAttribution(): fix 
memory leak
 Key: XERCESC-2222
 URL: https://issues.apache.org/jira/browse/XERCESC-2222
 Project: Xerces-C++
  Issue Type: Bug
  Components: Validating Parser (XML Schema)
Reporter: Even Rouault


If a memory allocation for conflictTable[] fails, or an error occurs later in
the function, the array is not freed.
Fixes [https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38533]
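
A scope guard is one way to make sure such a table is released on every exit path. The sketch below is a standalone illustration: `TableGuard`, `buildTable` and the `failAtRow` parameter are hypothetical, and the element type is an assumption rather than the type used by conflictTable[] in Xerces-C++.

```cpp
#include <cstddef>
#include <new>

// Hypothetical guard that owns a table of row pointers and frees it on scope exit.
class TableGuard
{
public:
    explicit TableGuard(std::size_t rows)
        : fRows(rows), fTable(new signed char*[rows]()) {}   // rows start as nullptr
    ~TableGuard()
    {
        for (std::size_t i = 0; i < fRows; ++i)
            delete[] fTable[i];   // no-op for rows that were never allocated
        delete[] fTable;
    }
    TableGuard(const TableGuard&) = delete;
    TableGuard& operator=(const TableGuard&) = delete;
    signed char** get() { return fTable; }
private:
    std::size_t fRows;
    signed char** fTable;
};

// Fills a rows x cols table and returns the number of rows allocated.
// failAtRow simulates an allocation failure at that row; the guard's
// destructor then frees whatever had been allocated so far.
inline std::size_t buildTable(std::size_t rows, std::size_t cols, std::size_t failAtRow)
{
    TableGuard guard(rows);
    std::size_t done = 0;
    for (std::size_t i = 0; i < rows; ++i)
    {
        if (i == failAtRow)
            throw std::bad_alloc();   // simulated allocation failure
        guard.get()[i] = new signed char[cols]();
        ++done;
    }
    return done;
}
```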






[jira] [Created] (XERCESC-2221) InMemMsgLoader::loadMsg(): fix memory leak when transcoding fails

2021-08-26 Thread Even Rouault (Jira)
Even Rouault created XERCESC-2221:
-

 Summary: InMemMsgLoader::loadMsg(): fix memory leak when 
transcoding fails
 Key: XERCESC-2221
 URL: https://issues.apache.org/jira/browse/XERCESC-2221
 Project: Xerces-C++
  Issue Type: Bug
  Components: Utilities
Affects Versions: 3.2.3
Reporter: Even Rouault


Seen with the IconvGNU transcoder when parsing "

[jira] [Updated] (XERCESC-2221) InMemMsgLoader::loadMsg(): fix memory leak when transcoding fails

2021-08-26 Thread Even Rouault (Jira)


 [ 
https://issues.apache.org/jira/browse/XERCESC-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Even Rouault updated XERCESC-2221:
--
Description: 
Seen with the IconvGNU transcoder when parsing "

> InMemMsgLoader::loadMsg(): fix memory leak when transcoding fails
> -
>
> Key: XERCESC-2221
> URL: https://issues.apache.org/jira/browse/XERCESC-2221
> Project: Xerces-C++
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 3.2.3
>Reporter: Even Rouault
>Priority: Major
>
> Seen with the IconvGNU transcoder when parsing "
> The reason is that XMLString::transcode(repText2, manager) throws a
> TranscodingException, which causes the tmp1 string to leak.
> {noformat}
> 0 0x8791409 in operator new(unsigned int) /src/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:99:3
> 1 0xbd147f7 in xercesc_4_0::MemoryManagerImpl::allocate(unsigned int) gdal/xerces-c/src/xercesc/internal/MemoryManagerImpl.cpp:40:18
> 2 0xbe8c73e in xercesc_4_0::IconvGNULCPTranscoder::transcode(char const*, xercesc_4_0::MemoryManager*) gdal/xerces-c/src/xercesc/util/Transcoders/IconvGNU/IconvGNUTransService.cpp:870:32
> 3 0xbc22ca2 in xercesc_4_0::XMLString::transcode(char const*, xercesc_4_0::MemoryManager*) gdal/xerces-c/src/xercesc/util/XMLString.cpp:621:25
> 4 0xbe8f4ad in xercesc_4_0::InMemMsgLoader::loadMsg(unsigned int, char16_t*, unsigned int, char const*, char const*, char const*, char const*, xercesc_4_0::MemoryManager*) gdal/xerces-c/src/xercesc/util/MsgLoaders/InMemory/InMemMsgLoader.cpp:157:16
> 5 0xbc20175 in xercesc_4_0::XMLException::loadExceptText(xercesc_4_0::XMLExcepts::Codes, char const*, char const*, char const*, char const*) gdal/xerces-c/src/xercesc/util/XMLException.cpp:241:23
> 6 0xbc48bee in xercesc_4_0::UTFDataFormatException::UTFDataFormatException(char const*, unsigned long long, xercesc_4_0::XMLExcepts::Codes, char const*, char const*, char const*, char const*, xercesc_4_0::MemoryManager*) gdal/xerces-c/src/xercesc/util/UTFDataFormatException.hpp:31:1
> 7 0xbc4824e in xercesc_4_0::XMLUTF8Transcoder::transcodeFrom(unsigned char const*, unsigned int, char16_t*, unsigned int, unsigned int&, unsigned char*) gdal/xerces-c/src/xercesc/util/XMLUTF8Transcoder.cpp:182:13
> 8 0xbd27d7e in xercesc_4_0::XMLReader::xcodeMoreChars(char16_t*, unsigned char*, unsigned int) gdal/xerces-c/src/xercesc/internal/XMLReader.cpp:1926:34
> 9 0xbd271dd in xercesc_4_0::XMLReader::refreshCharBuffer() gdal/xerces-c/src/xercesc/internal/XMLReader.cpp:571:19
> 10 0xbd15c63 in xercesc_4_0::XMLReader::peekNextChar(char16_t&) gdal/xerces-c/src/xercesc/internal/XMLReader.hpp:767:14
> 11 0xbd15aaf in xercesc_4_0::ReaderMgr::peekNextChar() gdal/xerces-c/src/xercesc/internal/ReaderMgr.cpp:158:21
> 12 0xbd328da in xercesc_4_0::XMLScanner::scanProlog() gdal/xerces-c/src/xercesc/internal/XMLScanner.cpp:1241:45
> 13 0xbd31ef4 in xercesc_4_0::XMLScanner::scanFirst(xercesc_4_0::InputSource const&, xercesc_4_0::XMLPScanToken&) gdal/xerces-c/src/xercesc/internal/XMLScanner.cpp:549:9
> 14 0xbdadcff in xercesc_4_0::SAX2XMLReaderImpl::parseFirst(xercesc_4_0::InputSource const&, xercesc_4_0::XMLPScanToken&) gdal/xerces-c/src/xercesc/parsers/SAX2XMLReaderImpl.cpp:500:22
> {noformat}
>  






[jira] [Updated] (XERCESC-2094) Memory leak related to invalid encoding

2017-05-19 Thread Even Rouault (JIRA)

 [ 
https://issues.apache.org/jira/browse/XERCESC-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Even Rouault updated XERCESC-2094:
--
Description: 
Issue originally found through OSS-Fuzz on GDAL (for reference,
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=1685 : the link will not
be publicly accessible until 90 days have passed), but it can be reproduced with
the Xerces-C SAX2Count utility.

On the attached file, Valgrind reports a memory leak:

The content of the file is:
{{{
http://schemas.opengis.net/gml"/>
}}}

valgrind --leak-check=full /home/even/install-xerces-c-3.1.4/bin/SAX2Count xerces-c-leak.xml
==21268== Memcheck, a memory error detector
==21268== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==21268== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==21268== Command: /home/even/install-xerces-c-3.1.4/bin/SAX2Count /home/even/gdal/trunk/gdal/xerces-c-leak.xml
==21268==

Fatal Error at file /home/even/gdal/trunk/gdal/xerces-c-leak.xml, line 1, char 35
  Message: unable to create converter for 'U' encoding
==21268==
==21268== HEAP SUMMARY:
==21268==     in use at exit: 76,348 bytes in 10 blocks
==21268==   total heap usage: 9,244 allocs, 9,234 frees, 1,282,907 bytes allocated
==21268==
==21268== 52 (40 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 10
==21268==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21268==    by 0x4FF9B58: xercesc_3_1::MemoryManagerImpl::allocate(unsigned long) (MemoryManagerImpl.cpp:40)
==21268==    by 0x4F7EE05: xercesc_3_1::XMemory::operator new(unsigned long, xercesc_3_1::MemoryManager*) (XMemory.cpp:68)
==21268==    by 0x4F7E660: xercesc_3_1::ENameMapFor::makeNew(unsigned long, xercesc_3_1::MemoryManager*) const (TransENameMap.c:50)
==21268==    by 0x4F7AF20: xercesc_3_1::XMLTransService::makeNewTranscoderFor(unsigned short const*, xercesc_3_1::XMLTransService::Codes&, unsigned long, xercesc_3_1::MemoryManager*) (TransService.cpp:147)
==21268==    by 0x5010A75: xercesc_3_1::XMLReader::refreshCharBuffer() (XMLReader.cpp:523)
==21268==    by 0x4FFA5AA: peekNextChar (XMLReader.hpp:767)
==21268==    by 0x4FFA5AA: xercesc_3_1::ReaderMgr::peekNextChar() (ReaderMgr.cpp:158)
==21268==    by 0x5016297: xercesc_3_1::XMLScanner::scanProlog() (XMLScanner.cpp:1238)
==21268==    by 0x4FEE371: xercesc_3_1::IGXMLScanner::scanDocument(xercesc_3_1::InputSource const&) (IGXMLScanner.cpp:206)
==21268==    by 0x5017E6D: xercesc_3_1::XMLScanner::scanDocument(unsigned short const*) (XMLScanner.cpp:400)
==21268==    by 0x5018221: xercesc_3_1::XMLScanner::scanDocument(char const*) (XMLScanner.cpp:408)
==21268==    by 0x5044F47: xercesc_3_1::SAX2XMLReaderImpl::parse(char const*) (SAX2XMLReaderImpl.cpp:451)
==21268==
==21268== LEAK SUMMARY:
==21268==    definitely lost: 40 bytes in 1 blocks
==21268==    indirectly lost: 12 bytes in 1 blocks
==21268==      possibly lost: 0 bytes in 0 blocks
==21268==    still reachable: 76,296 bytes in 8 blocks
==21268==         suppressed: 0 bytes in 0 blocks
==21268== Reachable blocks (those to which a pointer was found) are not shown.
==21268== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==21268==
==21268== For counts of detected and suppressed errors, rerun with: -v
==21268== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

I've found that the leak occurs only if the following conditions are met: there 
is a newline character between

[jira] [Created] (XERCESC-2094) Memory leak related to invalid encoding

2017-05-19 Thread Even Rouault (JIRA)
Even Rouault created XERCESC-2094:
-

 Summary: Memory leak related to invalid encoding
 Key: XERCESC-2094
 URL: https://issues.apache.org/jira/browse/XERCESC-2094
 Project: Xerces-C++
  Issue Type: Bug
Affects Versions: 3.1.4, 3.1.3
 Environment: Probably all. In that case Ubuntu 16.04 x86_64
Reporter: Even Rouault
 Attachments: xerces-c-leak.xml

Issue originally found through OSS-Fuzz on GDAL (for reference,
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=1685 : the link will not
be publicly accessible until 90 days have passed), but it can be reproduced with
the Xerces-C SAX2Count utility.

On the attached file, Valgrind reports a memory leak:

The content of the file is:
{{{
http://schemas.opengis.net/gml"/>
}}}

valgrind --leak-check=full /home/even/install-xerces-c-3.1.4/bin/SAX2Count xerces-c-leak.xml
==21268== Memcheck, a memory error detector
==21268== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==21268== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==21268== Command: /home/even/install-xerces-c-3.1.4/bin/SAX2Count /home/even/gdal/trunk/gdal/xerces-c-leak.xml
==21268==

Fatal Error at file /home/even/gdal/trunk/gdal/xerces-c-leak.xml, line 1, char 35
  Message: unable to create converter for 'U' encoding
==21268==
==21268== HEAP SUMMARY:
==21268==     in use at exit: 76,348 bytes in 10 blocks
==21268==   total heap usage: 9,244 allocs, 9,234 frees, 1,282,907 bytes allocated
==21268==
==21268== 52 (40 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 10
==21268==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21268==    by 0x4FF9B58: xercesc_3_1::MemoryManagerImpl::allocate(unsigned long) (MemoryManagerImpl.cpp:40)
==21268==    by 0x4F7EE05: xercesc_3_1::XMemory::operator new(unsigned long, xercesc_3_1::MemoryManager*) (XMemory.cpp:68)
==21268==    by 0x4F7E660: xercesc_3_1::ENameMapFor::makeNew(unsigned long, xercesc_3_1::MemoryManager*) const (TransENameMap.c:50)
==21268==    by 0x4F7AF20: xercesc_3_1::XMLTransService::makeNewTranscoderFor(unsigned short const*, xercesc_3_1::XMLTransService::Codes&, unsigned long, xercesc_3_1::MemoryManager*) (TransService.cpp:147)
==21268==    by 0x5010A75: xercesc_3_1::XMLReader::refreshCharBuffer() (XMLReader.cpp:523)
==21268==    by 0x4FFA5AA: peekNextChar (XMLReader.hpp:767)
==21268==    by 0x4FFA5AA: xercesc_3_1::ReaderMgr::peekNextChar() (ReaderMgr.cpp:158)
==21268==    by 0x5016297: xercesc_3_1::XMLScanner::scanProlog() (XMLScanner.cpp:1238)
==21268==    by 0x4FEE371: xercesc_3_1::IGXMLScanner::scanDocument(xercesc_3_1::InputSource const&) (IGXMLScanner.cpp:206)
==21268==    by 0x5017E6D: xercesc_3_1::XMLScanner::scanDocument(unsigned short const*) (XMLScanner.cpp:400)
==21268==    by 0x5018221: xercesc_3_1::XMLScanner::scanDocument(char const*) (XMLScanner.cpp:408)
==21268==    by 0x5044F47: xercesc_3_1::SAX2XMLReaderImpl::parse(char const*) (SAX2XMLReaderImpl.cpp:451)
==21268==
==21268== LEAK SUMMARY:
==21268==    definitely lost: 40 bytes in 1 blocks
==21268==    indirectly lost: 12 bytes in 1 blocks
==21268==      possibly lost: 0 bytes in 0 blocks
==21268==    still reachable: 76,296 bytes in 8 blocks
==21268==         suppressed: 0 bytes in 0 blocks
==21268== Reachable blocks (those to which a pointer was found) are not shown.
==21268== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==21268==
==21268== For counts of detected and suppressed errors, rerun with: -v
==21268== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

I've found that the leak occurs only if the following conditions are met: there 
is a newline character between