Re: [REVIEW 3-5] fdo#47644 performance regression on largish .doc
On Thu, 2012-05-10 at 21:56 +0100, Caolán McNamara wrote:
> with a scenario of nBgn = 11, nRel = 2, desired result is 13

Ah ! how stupid of me, I missed that - but this case is almost totally pointless ;-) We are just victims of this optimisation of starting in the middle - which means we have to go and look up where we are, when we actually precisely know our offset from the beginning of the chain/array (which is nNew/nPageSize).

So - how about the attached diff, hopefully rather easier to review and back-port :-) Should I knock a version up for -3-5 ? The #ifdef'd bravery is never triggered, if enabled (FWIW), during import / export of the slow .doc anyhow.

All the best,

		Michael.

-- 
michael.me...@suse.com , Pseudo Engineer, itinerant idiot

diff --git a/sot/source/sdstor/stgstrms.cxx b/sot/source/sdstor/stgstrms.cxx
index 88bc416..67da8ee 100644
--- a/sot/source/sdstor/stgstrms.cxx
+++ b/sot/source/sdstor/stgstrms.cxx
@@ -334,27 +334,12 @@ void StgStrm::SetEntry( StgDirEntry& r )
     r.SetDirty();
 }
 
-namespace lcl
-{
-#if defined(__GXX_EXPERIMENTAL_CXX0X__) || __cplusplus >= 201103L
-    using std::is_sorted;
-#else
-    template<typename iter> bool is_sorted(iter aStart, iter aEnd)
-    {
-        if (aStart == aEnd)
-            return true;
-
-        for (iter aNext = aStart + 1; aNext != aEnd; aStart = aNext, ++aNext)
-        {
-            if (*aNext < *aStart)
-                return false;
-        }
-
-        return true;
-    }
-#endif
-}
-
+/*
+ * The page chain is basically a singly linked list of slots, each
+ * pointing to the next page. Instead of traversing the file structure
+ * for this each time, build a simple flat in-memory vector
+ * of pages.
+ */
 bool StgStrm::buildPageChainCache()
 {
     if (nSize > 0)
@@ -370,10 +355,6 @@ bool StgStrm::buildPageChainCache()
         return false;
     }
 
-    m_bSortedPageChain = lcl::is_sorted(m_aPagesCache.begin(), m_aPagesCache.end());
-
-    SAL_WARN_IF(!m_bSortedPageChain, "sot", "unsorted page chain, that's suspicious");
-
     return true;
 }
 
@@ -403,6 +384,35 @@ sal_Bool StgStrm::Pos2Page( sal_Int32 nBytePos )
     nPos = nBytePos;
     if( nOld == nNew )
         return sal_True;
+
+    if (m_aPagesCache.empty() && nNew > ARBITRARY_LARGE_AMOUNT_OF_PAGES)
+    {
+        SAL_WARN("sot", "kicking off large seek helper\n");
+        buildPageChainCache();
+    }
+
+    if (!m_aPagesCache.empty())
+    {
+        size_t nIdx = nNew / nPageSize;
+
+#ifdef BRAVE_BUT_UN_NECESSARY
+        // special case: seek to 1st byte of new, unallocated page
+        // (in case the file size is a multiple of the page size)
+        if( nBytePos == nSize && !nOffset )
+        {
+            nIdx--;
+            nOffset = nPageSize;
+        }
+#endif
+
+        if (nIdx < m_aPagesCache.size())
+        {
+            nPage = m_aPagesCache[ nIdx ];
+            //fprintf (stderr, "got page ! %d offset %d\n", (int)nPage, (int)nOffset);
+            return sal_Bool( nPage >= 0 );
+        }
+    }
+
     if( nNew > nOld )
     {
         // the new position is after the current, so an incremental
@@ -420,62 +430,19 @@ sal_Bool StgStrm::Pos2Page( sal_Int32 nBytePos )
     // now, traverse the FAT chain.
     nRel /= nPageSize;
-    sal_Int32 nLast = STG_EOF;
-    if (m_aPagesCache.empty() && nRel < ARBITRARY_LARGE_AMOUNT_OF_PAGES)
-    {
+    sal_Int32 nLast = STG_EOF;
     while (nRel && nBgn >= 0)
     {
         nLast = nBgn;
         nBgn = pFat->GetNextPage( nBgn );
         nRel--;
     }
-    }
-    else if (nBgn >= 0)
-    {
-        //Seeking large distances is slow, so if we're starting seeking (some
-        //fairly arbitrary) large distances, build a cache and re-use it for
-        //subsequent seeks
-        if (m_aPagesCache.empty())
-        {
-            SAL_WARN("sot", "kicking off large seek helper\n");
-            buildPageChainCache();
-        }
-
-        std::vector<sal_Int32>::iterator aI;
-
-        if (m_bSortedPageChain)
-            aI = std::lower_bound(m_aPagesCache.begin(), m_aPagesCache.end(), nBgn);
-        else
-            aI = std::find(m_aPagesCache.begin(), m_aPagesCache.end(), nBgn);
-
-        if (aI == m_aPagesCache.end())
-        {
-            SAL_WARN("sot", "Unknown page position");
-            nBgn = STG_EOF;
-        }
-        else
-        {
-            size_t nBgnDistance = std::distance(m_aPagesCache.begin(), aI);
-
-            size_t nIndex = nBgnDistance + nRel;
-
-            if (nIndex >= m_aPagesCache.size())
-            {
-                nRel = m_aPagesCache.size() - nBgnDistance;
-                nIndex = m_aPagesCache.size() - 1;
-            }
-            else
-                nRel = 0;
-
-            nLast = nIndex ? m_aPagesCache[nIndex - 1] : STG_EOF;
-            nBgn = m_aPagesCache[nIndex];
-        }
-    }
     // special case: seek to 1st byte of new, unallocated page
     // (in case the file size is a multiple of the page size)
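[Editor's note: for readers following along, the arithmetic the new fast path relies on can be shown in isolation. This is a stand-alone sketch under the obvious assumptions, not the patch itself; only the names mirror the diff above.]

[snip]
#include <cstdint>
#include <vector>

// Stand-alone sketch of the fast path added in the diff above: with the
// whole page chain cached in a flat vector, a byte offset maps straight to
// a physical page - nBytePos / nPageSize is the chain slot, and
// nBytePos % nPageSize is the offset within that page.
int32_t lookupPage(const std::vector<int32_t>& rPagesCache,
                   int32_t nBytePos, int32_t nPageSize)
{
    size_t nIdx = static_cast<size_t>(nBytePos / nPageSize);
    if (nIdx < rPagesCache.size())
        return rPagesCache[nIdx];   // physical page holding nBytePos
    return -2;                      // STG_EOF-style "no such page"
}
[/snip]

So with the usual 512-byte pages, byte 5000 lives in chain slot 9 (5000 / 512) at offset 392 (5000 % 512), with no FAT walking at all.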
Re: [REVIEW 3-5] fdo#47644 performance regression on largish .doc
On Fri, 2012-05-11 at 10:09 +0100, Michael Meeks wrote:
> we actually precisely know our offset from the beginning of the
> chain/array (which is nNew/nPageSize).

hmm, that escaped me entirely. Been trying to find a loophole that invalidates that, e.g. 10 20 30 40 where 20 is the first entry. But anything like that seems to be covered, in that the ::Inits pass in the first entry, which ends up as nStart, and we build the chain only from there. So I can't see any reason it wouldn't work.

> So - how about the attached diff, hopefully rather easier to review
> and back-port :-)

Seems sane, though it probably makes sense to tweak the trigger for calling buildPageChainCache, as the units for nNew there are in bytes while the define is in pages. And we might want to avoid calling it in the case of a small relative jump forward in pages.

C.
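[Editor's note: for concreteness, the tweak Caolán is suggesting might look something like the following. This is purely hypothetical and not from either patch; only nNew, nOld, nPageSize, ARBITRARY_LARGE_AMOUNT_OF_PAGES and buildPageChainCache are names taken from the diff.]

[snip]
// Hypothetical rework of the trigger: compare in pages, not bytes, and size
// the decision on the number of chain steps an uncached seek would actually
// need - forward seeks continue from the current page, backward seeks have
// to restart from the head of the chain.
sal_Int32 nStepsNeeded = (nNew > nOld) ? (nNew - nOld) / nPageSize
                                       : nNew / nPageSize;
if (m_aPagesCache.empty() && nStepsNeeded > ARBITRARY_LARGE_AMOUNT_OF_PAGES)
    buildPageChainCache();
[/snip]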
[REVIEW 3-5] fdo#47644 performance regression on largish .doc
fdo#47644: a big 18 meg .doc is super dooper slow to load, mostly because of the extra checks I added to sanity-test the .doc file during parsing. Turns out that seeking backwards in those ole2 formats is incredibly slow; this might affect other formats like xls and ppt that use some of the shared msfilter code.

1st attachment gets us back to where we came from in terms of performance there.

The second one is maybe a bit more contentious for 3-5, but I include it for a look-over. Basically there's a page chain in the ole2 format, and to seek to a location you walk through the chain until you get to the correct page. I see that in practice most documents have their chain in a presorted order (I have no idea if having them unsorted is actually a sign of a broken document or if it's legal), so we can do a far faster binary search if we walk the chain once and hold on to the results.

So the second attachment keeps the chain if it's one of the circumstances where we have to parse the whole thing anyway. If not (the usual case), and we're seeking a (fairly arbitrary) large number of steps where it's likely that the cost-benefit is in favour of it, then it generates the chain once and holds on to it for reuse until any event which might invalidate the chain.

Second patch knocks about 50 seconds off load time. First patch knocks some unknown, but > 10 min, amount off load time.

C.

Don't even think about measuring on a gcc dbgutil build btw; the concept checking of the stl lower_bound is a huge cost, so the times aren't even roughly uniformly slower on dbgutil vs product, but spike dramatically on some stl stuff.

From 1e89ba85d89858ae8c3263a4e6b5b522016fe445 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Caol=C3=A1n=20McNamara?= <caol...@redhat.com>
Date: Thu, 10 May 2012 15:41:17 +0100
Subject: [PATCH] Resolves: fdo#47644 compound storage backend is poor at
 knowing stream size

---
 sot/inc/sot/stg.hxx              |    3 +++
 sot/inc/sot/storage.hxx          |    1 +
 sot/source/sdstor/stg.cxx        |    7 +++++++
 sot/source/sdstor/storage.cxx    |    7 +++++++
 sot/source/sdstor/ucbstorage.cxx |    5 +++++
 tools/inc/tools/stream.hxx       |    3 ++-
 tools/source/stream/stream.cxx   |    6 +++---
 7 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/sot/inc/sot/stg.hxx b/sot/inc/sot/stg.hxx
index c8d18da..3071017 100644
--- a/sot/inc/sot/stg.hxx
+++ b/sot/inc/sot/stg.hxx
@@ -90,6 +90,7 @@ public:
     virtual sal_uLong   Tell() = 0;
     virtual void        Flush() = 0;
     virtual sal_Bool    SetSize( sal_uLong nNewSize ) = 0;
+    virtual sal_uLong   GetSize() const = 0;
     virtual sal_Bool    CopyTo( BaseStorageStream * pDestStm ) = 0;
     virtual sal_Bool    Commit() = 0;
     virtual sal_Bool    Revert() = 0;
@@ -171,6 +172,7 @@ public:
     virtual sal_uLong   Tell() { return nPos; }
     virtual void        Flush();
     virtual sal_Bool    SetSize( sal_uLong nNewSize );
+    virtual sal_uLong   GetSize() const;
     virtual sal_Bool    CopyTo( BaseStorageStream * pDestStm );
     virtual sal_Bool    Commit();
     virtual sal_Bool    Revert();
@@ -266,6 +268,7 @@ public:
     virtual sal_uLong   Tell();
     virtual void        Flush();
     virtual sal_Bool    SetSize( sal_uLong nNewSize );
+    virtual sal_uLong   GetSize() const;
     virtual sal_Bool    CopyTo( BaseStorageStream * pDestStm );
     virtual sal_Bool    Commit();
     virtual sal_Bool    Revert();
diff --git a/sot/inc/sot/storage.hxx b/sot/inc/sot/storage.hxx
index 26258bc..4f602e9 100644
--- a/sot/inc/sot/storage.hxx
+++ b/sot/inc/sot/storage.hxx
@@ -96,6 +96,7 @@ public:
     sal_Bool    GetProperty( const String& rName, ::com::sun::star::uno::Any& rValue );
     ::com::sun::star::uno::Reference< ::com::sun::star::io::XInputStream > GetXInputStream() const;
+    virtual sal_Size remainingSize();
 };
 
 #ifndef SOT_DECL_SOTSTORAGESTREAM_DEFINED
diff --git a/sot/source/sdstor/stg.cxx b/sot/source/sdstor/stg.cxx
index 4c4be65..18e95f6 100644
--- a/sot/source/sdstor/stg.cxx
+++ b/sot/source/sdstor/stg.cxx
@@ -252,6 +252,13 @@ sal_Bool StorageStream::SetSize( sal_uLong nNewSize )
         return sal_False;
 }
 
+sal_uLong StorageStream::GetSize() const
+{
+    if( Validate() )
+        return pEntry->GetSize();
+    return 0;
+}
+
 sal_Bool StorageStream::Commit()
 {
     if( !Validate() )
diff --git a/sot/source/sdstor/storage.cxx b/sot/source/sdstor/storage.cxx
index 897f9fe..4d24f1f 100644
--- a/sot/source/sdstor/storage.cxx
+++ b/sot/source/sdstor/storage.cxx
@@ -298,6 +298,13 @@ sal_uInt32 SotStorageStream::GetSize() const
     return nSize;
 }
 
+sal_Size SotStorageStream::remainingSize()
+{
+    if (pOwnStm)
+        return pOwnStm->GetSize() - Tell();
+    return SvStream::remainingSize();
+}
+
 /*************************************************************************
 |*    SotStorageStream::CopyTo()
 |*
diff --git
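[Editor's note: the rest of the attachment (ucbstorage.cxx, stream.hxx, stream.cxx) is cut off in the archive. To illustrate why the first patch helps, here is a minimal sketch of the cost it avoids; it is not part of the patch, and only remainingSize()/GetSize()/Tell() are names from it.]

[snip]
// Illustrative only. A sanity check like "is this length field plausible?"
// needs to know how many bytes are left in the stream. Without a cheap
// GetSize(), that effectively means a pair of seeks:

sal_Size slowRemaining(SvStream& rStrm)
{
    sal_Size nCur = rStrm.Tell();
    rStrm.Seek(STREAM_SEEK_TO_END);   // walks the OLE2 page chain forward
    sal_Size nEnd = rStrm.Tell();
    rStrm.Seek(nCur);                 // ...and seeks all the way back again
    return nEnd - nCur;
}

// With the patch, SotStorageStream::remainingSize() answers the same
// question from the directory entry - pOwnStm->GetSize() - Tell() - with no
// seeking, and hence no page-chain walk, at all.
[/snip]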
Re: [REVIEW 3-5] fdo#47644 performance regression on largish .doc
Hi there,

On Thu, 2012-05-10 at 16:06 +0100, Caolán McNamara wrote:
> fdo#47644: a big 18 meg .doc is super dooper slow to load, mostly because
> of the extra checks I added to sanity-test the .doc file during parsing.

:-)

> 1st attachment gets us back to where we came from in terms of
> performance there.

Great - I've pushed that to libreoffice-3-5.

> The second one is maybe a bit more contentious for 3-5, but I include
> it for a look-over. Basically there's a page chain in the ole2 format,
> and to seek to a location you walk through the chain until you get to
> the correct page.

Yep - it's a FAT structure; a nasty one-slot linked list.

> I see that in practice most documents have their chain in a presorted
> order (I have no idea if having them unsorted is actually a sign of a
> broken document or if it's legal)

It is legal, but it's highly unlikely.

> So the second attachment keeps the chain if it's one of the
> circumstances where we have to parse the whole thing anyway. If not
> (the usual case), and we're seeking a (fairly arbitrary) large number
> of steps where it's likely that the cost-benefit is in favour of it,
> then it generates the chain once and holds on to it for reuse until
> any event which might invalidate the chain.

So - I'm a bit lost with the second patch. The FAT is essentially a linked list, and we walk it to find the block we want, so:

	nRel = 3; -> go three steps along the linked list: A -> B -> C == C ...

Your pageCache looks like the right approach, and turns this into a simple vector we can rapidly look up:

	[0] = A, [1] = B, [2] = C

And so on - hopefully I'm there so far :-) With the 512-byte block size we expect, that turns a 100Mb OLE2 stream into an 800k vector (which is just fine).

> Second patch knocks about 50 seconds off load time. First patch knocks
> some unknown, but > 10 min, amount off load time.

My question is thus about:

[snip]
        std::vector<sal_Int32>::iterator aI;

        if (m_bSortedPageChain)
            aI = std::lower_bound(m_aPagesCache.begin(), m_aPagesCache.end(), nBgn);
        else
            aI = std::find(m_aPagesCache.begin(), m_aPagesCache.end(), nBgn);

        if (aI == m_aPagesCache.end())
        {
            SAL_WARN("sot", "Unknown page position");
            nBgn = STG_EOF;
        }
        else
        {
            size_t nBgnDistance = std::distance(m_aPagesCache.begin(), aI);

            size_t nIndex = nBgnDistance + nRel;

            if (nIndex >= m_aPagesCache.size())
            {
                nRel = m_aPagesCache.size() - nBgnDistance;
                nIndex = m_aPagesCache.size() - 1;
            }
            else
                nRel = 0;

            nLast = nIndex ? m_aPagesCache[nIndex - 1] : STG_EOF;
            nBgn = m_aPagesCache[nIndex];
        }
[/snip]

Surely that should read:

	nBgn = m_aPageCache[nRel];

or did I go a bit crazy ? [ I'm trying to turn the above into that mentally and giving up ;- ]

Otherwise the clearing of the cache looks sensible; though StgFATStrm::SetPage() should prolly just update that vector as well as doing the nasty FAT walk, so:

	m_aPageCache[nBlocks] = nNewPage;

or somesuch (assuming it is long enough for that). Otherwise it looks really lovely.

Thanks !

		Michael.

-- 
michael.me...@suse.com , Pseudo Engineer, itinerant idiot
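[Editor's note: a toy model of the contrast Michael sketches with A -> B -> C versus [0]=A, [1]=B, [2]=C. This is not LibreOffice code; names and types are made up for illustration.]

[snip]
#include <cstdint>
#include <vector>

const int32_t STG_EOF_LIKE = -2;   // stand-in for STG_EOF

// The FAT is a table where fat[page] holds the number of the next page in
// the stream, so "seek forward nRel pages" is a pointer chase - O(nRel)
// lookups, each of which is a page read in the real code.
int32_t walkChain(const std::vector<int32_t>& fat, int32_t nBgn, int32_t nRel)
{
    while (nRel && nBgn >= 0)
    {
        nBgn = fat[nBgn];
        --nRel;
    }
    return nBgn;
}

// The page-chain cache flattens that list once: pagesCache[i] is the i-th
// page of the stream, so any position becomes a single O(1) index (or an
// O(log n) lower_bound if you only know the current page, as in the snippet
// quoted above).
int32_t cachedLookup(const std::vector<int32_t>& pagesCache, size_t nIdx)
{
    return nIdx < pagesCache.size() ? pagesCache[nIdx] : STG_EOF_LIKE;
}
[/snip]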
Re: [REVIEW 3-5] fdo#47644 performance regression on largish .doc
On Thu, 2012-05-10 at 17:37 +0100, Michael Meeks wrote:
> Surely that should read:
>
> 	nBgn = m_aPageCache[nRel];
>
> or did I go a bit crazy ? [ I'm trying to turn the above into that
> mentally and giving up ;- ]

how about a 4-element chain of 10 11 12 13, with a scenario of nBgn = 11, nRel = 2, desired result 13.

so nBgn of 11 gives a nBgnDistance of 1, so the desired nIndex is 3, i.e. nBgnDistance (1) + nRel (2); so nBgn = m_aPagesCache[nIndex], and set nRel to 0 to indicate we successfully stepped to the desired page.

with an alternative scenario of nBgn = 11, nRel = 8: we get nBgnDistance of 1 again, so a desired nIndex of 9; clip it, as it's out of scale, to the max of 3, and reduce nRel by how many valid steps we took (2) to give 6 un-achievable steps... oops, my out-of-bounds code is all wrong :-) fixed with 75e7643e22bcf674739ca890bfc06f80e872624b

> StgFATStrm::SetPage() should prolly just update that vector as well as
> doing the nasty FAT walk, so:
>
> 	m_aPageCache[nBlocks] = nNewPage;

would have to check around the insertion point whether the sort order is still valid afterwards, I guess; throwing away everything is my current cowardly conservative approach on any modifications.

C.
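[Editor's note: a sketch of the index arithmetic being worked through above, using Caolán's 4-element chain {10, 11, 12, 13}. This is one reading of the thread, not the code of fix 75e7643; the clipping branch just shows one way to report how many requested steps could not be taken.]

[snip]
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

int32_t stepViaCache(const std::vector<int32_t>& pagesCache,
                     size_t nBgnDistance,   // index of the current page in the cache
                     int32_t& nRel)         // steps requested; left-over steps on return
{
    size_t nIndex = nBgnDistance + nRel;
    if (nIndex >= pagesCache.size())
    {
        // clip to the last cached page and report the steps we couldn't take
        nRel -= static_cast<int32_t>(pagesCache.size() - 1 - nBgnDistance);
        nIndex = pagesCache.size() - 1;
    }
    else
        nRel = 0;
    return pagesCache[nIndex];
}

int main()
{
    std::vector<int32_t> aChain = {10, 11, 12, 13};
    int32_t nRel = 2;
    assert(stepViaCache(aChain, 1, nRel) == 13 && nRel == 0);  // nBgn = 11, nRel = 2 -> 13
    nRel = 8;
    assert(stepViaCache(aChain, 1, nRel) == 13 && nRel == 6);  // 6 un-achievable steps remain
}
[/snip]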