Re: What happened to our document repository for type detection tests?
> On Jun 22, 2024, at 6:59 AM, Marcus wrote:
>
> On 22.06.24 at 14:53, Bidouille wrote:
>>> I remember from old time that the QA team at Sun/Oracle had really a
>>> lot of documents for general and special testing.
>>>
>>> These were not part of the code repository and were loaded from their
>>> own test software. Maybe this is the link to the storage outside of
>>> the project.
>> If you have an URL, you can try to get with the WayBack machine
>> https://wayback-api.archive.org/
>
> they were stored on an internal server.

The Apache Tika and Apache POI projects use Common Crawl to build a large corpus for regression tests.

https://commoncrawl.org

Perhaps we can start doing the same? We can ask for help from Tika at [email protected] or POI at [email protected]

Best,
Dave

> Marcus

-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: What happened to our document repository for type detection tests?
On 22.06.24 at 14:53, Bidouille wrote:
>> I remember from old time that the QA team at Sun/Oracle had really a
>> lot of documents for general and special testing.
>>
>> These were not part of the code repository and were loaded from their
>> own test software. Maybe this is the link to the storage outside of
>> the project.
> If you have an URL, you can try to get with the WayBack machine
> https://wayback-api.archive.org/

They were stored on an internal server.

Marcus
Re: What happened to our document repository for type detection tests?
Hi Damjan,

I believe Carl Marcum was working on automated testing related to file types many moons ago... Maybe he has a backup copy or knows where these should be?

Best,
Pedro

> On 06/22/2024 2:56 AM WEST Damjan Jovanovic wrote:
>
> Hi
>
> While doing some analysis and refactoring of our unit tests, I found a
> really interesting - and important - test.
>
> It's in main/filter/qa/complex/filter/detection/typeDetection, and it
> iterates through a repository of documents, trying to get OpenOffice to
> detect the type of each document, and verifying it guessed correctly. This
> is important for preventing (and helping fix) bugs such as 126270, where
> some regression caused us to stop opening single-file OpenDocument files.
>
> In that directory, TypeDetection.props has the path to the documents:
> # UNIX:
> #TestDocumentPath=file:///net/margritte/usr/qaapi/dev/cws/filtercfg/docTypes
> # WINDOWS
> TestDocumentPath=//margritte/qaapi/dev/cws/filtercfg/docTypes
>
> and files.csv has their filenames, including:
> Writer/AoE2a.rtf
> Writer/Text_DOS.txt
> Writer/Word2000.doc
> Writer/Word2000_template.dot
> and many more.
>
> Those documents are not in Git, and were never in trunk.
>
> It would be good if we could get those documents back and continue testing
> them.
>
> Does anyone know what happened to them?
>
> Regards
> Damjan
Re: What happened to our document repository for type detection tests?
> I remember from old time that the QA team at Sun/Oracle had really a
> lot of documents for general and special testing.
>
> These were not part of the code repository and were loaded from their
> own test software. Maybe this is the link to the storage outside of
> the project.

If you have a URL, you can try to retrieve the documents with the Wayback Machine:
https://wayback-api.archive.org/
Re: What happened to our document repository for type detection tests?
On 22.06.24 at 03:56, Damjan Jovanovic wrote:
> ...
> Those documents are not in Git, and were never in trunk.
>
> It would be good if we could get those documents back and continue testing
> them.
>
> Does anyone know what happened to them?

I remember from the old days that the QA team at Sun/Oracle had a really large collection of documents for general and special testing.

These were not part of the code repository and were loaded by their own test software. Maybe this is the link to the storage outside of the project.

If these documents were not part of the handover to the ASF, then I think they are gone.

Marcus
What happened to our document repository for type detection tests?
Hi

While doing some analysis and refactoring of our unit tests, I found a really interesting - and important - test.

It's in main/filter/qa/complex/filter/detection/typeDetection, and it iterates through a repository of documents, trying to get OpenOffice to detect the type of each document, and verifying it guessed correctly. This is important for preventing (and helping fix) bugs such as 126270, where some regression caused us to stop opening single-file OpenDocument files.

In that directory, TypeDetection.props has the path to the documents:
# UNIX:
#TestDocumentPath=file:///net/margritte/usr/qaapi/dev/cws/filtercfg/docTypes
# WINDOWS
TestDocumentPath=//margritte/qaapi/dev/cws/filtercfg/docTypes

and files.csv has their filenames, including:
Writer/AoE2a.rtf
Writer/Text_DOS.txt
Writer/Word2000.doc
Writer/Word2000_template.dot
and many more.

Those documents are not in Git, and were never in trunk.

It would be good if we could get those documents back and continue testing them.

Does anyone know what happened to them?

Regards
Damjan
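To make the dependency concrete, here is a rough Python sketch (not the existing test, which lives in the OpenOffice source tree) of how one might check which of the documents listed in files.csv are present under a candidate document root, for example a recovered backup. The function names are hypothetical, and the files.csv layout used here (semicolon-delimited, relative path in the first column, `#` starting a comment line) is an assumption and should be checked against the real file.

```python
import csv
from pathlib import Path


def read_test_document_path(props_text):
    """Return the first uncommented TestDocumentPath value, or None."""
    for line in props_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out entries
        key, sep, value = line.partition("=")
        if sep and key.strip() == "TestDocumentPath":
            return value.strip()
    return None


def listed_documents(csv_text, column=0, delimiter=";"):
    """Return relative document paths from files.csv.

    ASSUMPTION: the column index, the delimiter and '#' comment lines
    are guesses about the file layout, not taken from the real file.
    """
    paths = []
    for row in csv.reader(csv_text.splitlines(), delimiter=delimiter):
        if not row or row[0].lstrip().startswith("#"):
            continue
        if column < len(row) and row[column].strip():
            paths.append(row[column].strip())
    return paths


def missing_documents(root, relative_paths):
    """Return the listed paths that do not exist as files under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).is_file()]
```

Running `missing_documents` against a local directory with the paths from `listed_documents` would show how much of the original corpus a given backup or replacement corpus actually covers.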
