[ https://issues.apache.org/jira/browse/TIKA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509431#comment-14509431 ]
Hudson commented on TIKA-1580:
------------------------------
SUCCESS: Integrated in tika-trunk-jdk1.7 #643 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/643/])
TIKA-1580: Fix to allow test to run on Windows with space in folder (dmeikle: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1675679)
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/isatab/ISArchiveParserTest.java
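
For readers unfamiliar with this class of failure: tests that locate resources through a URL often break when the path contains a space, because the URL form keeps the space percent-encoded as %20. The commit itself is not shown here, so the following is only a minimal sketch of the general technique, not the actual change made to ISArchiveParserTest. The class name ResourcePathSketch and the helper toPath are hypothetical.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; not the code from revision 1675679.
public class ResourcePathSketch {

    // Converts a file: URL to a filesystem path. Going through the URI
    // decodes percent-escapes such as %20 back into spaces.
    static Path toPath(URL url) throws URISyntaxException {
        return Paths.get(url.toURI());
    }

    public static void main(String[] args) throws Exception {
        // Build a file: URL for a directory whose name contains spaces.
        URL url = new File("folder with space", "ISArchive").toURI().toURL();

        // URL.getPath() keeps the escapes, so using it as a file name fails.
        System.out.println("raw URL path: " + url.getPath());

        // The URI-based conversion yields a usable filesystem path.
        System.out.println("decoded path: " + toPath(url));
    }
}
```

In a test, the URL would typically come from getClass().getResource(...) rather than from a File as in this sketch.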
> ISA-Tab parsers
> ---------------
>
> Key: TIKA-1580
> URL: https://issues.apache.org/jira/browse/TIKA-1580
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Giuseppe Totaro
> Assignee: Chris A. Mattmann
> Priority: Minor
> Labels: new-parser
> Fix For: 1.8
>
> Attachments: TIKA-1580.Mattmann.Totaro.032515.patch.txt,
> TIKA-1580.patch, TIKA-1580.v02.patch,
> TIKA-1580.v03.2.Mattmann.Totaro.03262015.patch
>
>
> We are going to add parsers for ISA-Tab data formats.
> ISA-Tab files are related to [ISA Tools|http://www.isa-tools.org/], which help
> to manage an increasingly diverse set of life science, environmental, and
> biomedical experiments that employ one or a combination of technologies.
> The ISA tools are built upon the _Investigation_, _Study_, and _Assay_ tabular
> format. Therefore, the ISA-Tab data format includes three types of file:
> Investigation file ({{i_xxxx.txt}}), Study file ({{s_xxxx.txt}}), and Assay file
> ({{a_xxxx.txt}}). These files are organized as a [top-down
> hierarchy|http://www.isa-tools.org/format/specification/]: an Investigation
> file includes one or more Study files, and each Study file includes one or more
> Assay files.
> Essentially, the Investigation file contains high-level information about
> the related studies, so it provides only metadata about ISA-Tab files.
> More details on the file format specification are [available
> online|http://isatab.sourceforge.net/docs/ISA-TAB_release-candidate-1_v1.0_24nov08.pdf].
> The attached patch provides a preliminary version of the ISA-Tab parsers
> (there are three parsers, one for each ISA-Tab file type):
> * {{ISATabInvestigationParser.java}}: parses Investigation files. It extracts
> only metadata.
> * {{ISATabStudyParser.java}}: parses Study files.
> * {{ISATabAssayParser.java}}: parses Assay files.
> The most important improvements still to be made are:
> * Combine these three parsers in order to parse an ISArchive
> * Provide a better mapping of both study and assay data on XHTML. Currently,
> {{ISATabStudyParser}} and {{ISATabAssayParser}} provide a naive mapping
> function relying on [Apache Commons
> CSV|https://commons.apache.org/proper/commons-csv/].
> Thanks for supporting me on this work [~chrismattmann].
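
The description above distinguishes the three ISA-Tab file types by name. As a rough illustration, the sketch below classifies a file by the prefix convention used in ISA-Tab (i_ for Investigation, s_ for Study, a_ for Assay). The class IsaTabFileKind and its classify method are hypothetical names chosen for this example; they are not part of the attached patches or of the Tika API, and the real parsers may detect file types differently.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; not taken from the TIKA-1580 patches.
public class IsaTabFileKind {

    enum Kind { INVESTIGATION, STUDY, ASSAY }

    // Returns the ISA-Tab file type implied by the file name, or empty
    // if the name does not follow the i_/s_/a_ + .txt convention.
    static Optional<Kind> classify(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (!name.endsWith(".txt")) {
            return Optional.empty();
        }
        if (name.startsWith("i_")) {
            return Optional.of(Kind.INVESTIGATION);
        }
        if (name.startsWith("s_")) {
            return Optional.of(Kind.STUDY);
        }
        if (name.startsWith("a_")) {
            return Optional.of(Kind.ASSAY);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Example file names are made up for the demonstration.
        for (String f : new String[] {"i_investigation.txt", "s_study.txt",
                                      "a_assay.txt", "README.md"}) {
            System.out.println(f + " -> " + classify(f));
        }
    }
}
```

A combined archive parser could use such a classification to start from the Investigation file and then hand each Study and Assay file to its own parser, following the top-down hierarchy.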
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)