Dear Wiki user, You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.
The "TikaGeographicInformationParser" page has been changed by GauthamGowrishankar:
https://wiki.apache.org/tika/TikaGeographicInformationParser?action=diff&rev1=4&rev2=5

  Currently Apache Tika lacks the required support to parse .iso19139 files that are crawled from the Acadis websites. An issue for this has been created by Prasanth Iyer
- (https://issues.apache.org/jira/browse/TIKA-1479) TIKA-1479. I will be continuing from where Prasanth left off. The progress is as below.
+ (https://issues.apache.org/jira/browse/TIKA-1479) TIKA-1479. The progress is as below.
  1. Extract the metadata using the Apache SIS library (Martin has been a great source of support in this regard).<<BR>>
  2. Customize the extracted metadata to construct key multi-value metadata.<<BR>>
  3. The format finalized so far is key1->[val1,val2..] , key2->[val1,val2...].<<BR>>
  I would like suggestions on the progress below.
- Default Meta Data extracted from Apache SIS framework is as below
  '''Default Meta Data'''
  ------------------------
  Metadata
-   +-Character set……………………………………………………………………………………………… UTF-8
+   +-Character set……………………………………………………………………………………………… UTF-8 <<BR>>
-   +-Contact
+   +-Contact <<BR>>
-   ¦   +-Role…………………………………………………………………………………………………………… Resource provider
+   ¦   +-Role…………………………………………………………………………………………………………… Resource provider <<BR>>
-   ¦   +-Party
+   ¦   +-Party <<BR>>
-   ¦       +-Name………………………………………………………………………………………………… UCAR/NCAR - CISL - ACADIS
+   ¦       +-Name………………………………………………………………………………………………… UCAR/NCAR - CISL - ACADIS <<BR>>
-   +-Identification info
+   +-Identification info <<BR>>
-   ¦   +-Citation
+   ¦   +-Citation <<BR>>
-   ¦   ¦   +-Title……………………………………………………………………………………………… Barrow Atqasuk ARCSS Plant
+   ¦   ¦   +-Title……………………………………………………………………………………………… Barrow Atqasuk ARCSS Plant <<BR>>
-   ¦   ¦   +-Date (1 of 2)
+   ¦   ¦   +-Date (1 of 2) <<BR>>
-   ¦   ¦   ¦   +-Date……………………………………………………………………………………… Dec 16, 2013 12:00:00 AM
+   ¦   ¦   ¦   +-Date……………………………………………………………………………………… Dec 16, 2013 12:00:00 AM <<BR>>
-   ¦   ¦   ¦   +-Date type………………………………………………………………………… Creation
+   ¦   ¦   ¦   +-Date type………………………………………………………………………… Creation <<BR>>
-   ¦   ¦   +-Date (2 of 2)
+   ¦   ¦   +-Date (2 of 2) <<BR>>
-   ¦   ¦   ¦   +-Date……………………………………………………………………………………… Feb 5, 2015 12:00:00 AM
+   ¦   ¦   ¦   +-Date……………………………………………………………………………………… Feb 5, 2015 12:00:00 AM <<BR>>
-   ¦   ¦   ¦   +-Date type………………………………………………………………………… Modified
+   ¦   ¦   ¦   +-Date type………………………………………………………………………… Modified <<BR>>
-   ¦   ¦   +-Cited responsible party
+   ¦   ¦   +-Cited responsible party <<BR>>
-   ¦   ¦       +-Role……………………………………………………………………………………… Point of contact
+   ¦   ¦       +-Role……………………………………………………………………………………… Point of contact <<BR>>
-   ¦   ¦       +-Party
+   ¦   ¦       +-Party <<BR>>
-   ¦   ¦           +-Name…………………………………………………………………………… Robert Hollister
+   ¦   ¦           +-Name…………………………………………………………………………… Robert Hollister <<BR>>
-   ¦   ¦           +-Contact info
+   ¦   ¦           +-Contact info <<BR>>
-   ¦   ¦               +-Address
+   ¦   ¦               +-Address <<BR>>
-   ¦   ¦                   +-Electronic mail address…… [email protected]
+   ¦   ¦                   +-Electronic mail address…… [email protected] <<BR>>
-   ¦   +-Abstract………………………………………………………………………………………………… These files contain data representing the periodic plant measures of species within each plot in a text tab delimited format. The data presented are seasonal growth of graminoids (length of leaf and length of inflorescence) and seasonal flowering of all species (number of inflorescences in flower within a plot), collected weekly during the summers of 2012-20XX for a subset of 30 grid plots at two sites (Barrow ARCSS grid and Atqasuk ARCSS grid).
+   ¦   +-Abstract………………………………………………………………………………………………… These files contain data representing the periodic plant measures of species within each plot in a text tab delimited format. <<BR>> The data presented are seasonal growth of graminoids (length of leaf and length of inflorescence) and seasonal flowering of all species (number of inflorescences in flower within a plot), collected weekly during the summers of 2012-20XX for a subset of 30 grid plots at two sites (Barrow ARCSS grid and Atqasuk ARCSS grid). <<BR>>
-   ¦   +-Status……………………………………………………………………………………………………… On going
+   ¦   +-Status……………………………………………………………………………………………………… On going <<BR>>
-   ¦   +-Point of contact
+   ¦   +-Point of contact <<BR>>
-   ¦   ¦   +-Role………………………………………………………………………………………………… Point of contact
+   ¦   ¦   +-Role………………………………………………………………………………………………… Point of contact <<BR>>
-   ¦   ¦   +-Party
+   ¦   ¦   +-Party <<BR>>
-   ¦   ¦       +-Name……………………………………………………………………………………… Robert Hollister
+   ¦   ¦       +-Name……………………………………………………………………………………… Robert Hollister <<BR>>
-   ¦   ¦       +-Contact info
+   ¦   ¦       +-Contact info <<BR>>
-   ¦   ¦           +-Address
+   ¦   ¦           +-Address <<BR>>
-   ¦   ¦               +-Electronic mail address……………… [email protected]
+   ¦   ¦               +-Electronic mail address……………… [email protected] <<BR>>
-   ¦   +-Resource format
+   ¦   +-Resource format <<BR>>
-   ¦   ¦   +-Format specification citation
+   ¦   ¦   +-Format specification citation <<BR>>
-   ¦   ¦       +-Alternate title………………………………………………………… Other ASCII
+   ¦   ¦       +-Alternate title………………………………………………………… Other ASCII <<BR>>
-   ¦   +-Descriptive keywords (1 of 5)
+   ¦   +-Descriptive keywords (1 of 5) <<BR>>
-   ¦   ¦   +-Keyword………………………………………………………………………………………… EARTH SCIENCE > BIOSPHERE > TERRESTRIAL ECOSYSTEMS > ALPINE/TUNDRA
+   ¦   ¦   +-Keyword………………………………………………………………………………………… EARTH SCIENCE > BIOSPHERE > TERRESTRIAL ECOSYSTEMS > ALPINE/TUNDRA <<BR>>
-   ¦   ¦   +-Type………………………………………………………………………………………………… Theme
+   ¦   ¦   +-Type………………………………………………………………………………………………… Theme <<BR>>
-   ¦   ¦   +-Thesaurus name
+   ¦   ¦   +-Thesaurus name <<BR>>
-   ¦   ¦       +-Title…………………………………………………………………………………… NASA/GCMD Earth Science Keywords
+   ¦   ¦       +-Title…………………………………………………………………………………… NASA/GCMD Earth Science Keywords <<BR>>
-   ¦   ¦       +-Alternate title………………………………………………………… Science and Services Keywords
+   ¦   ¦       +-Alternate title………………………………………………………… Science and Services Keywords <<BR>>
-   ¦   ¦       +-Date
+   ¦   ¦       +-Date <<BR>>
-   ¦   ¦           +-Date…………………………………………………………………………… May 21, 2014 12:00:00 AM
+   ¦   ¦           +-Date…………………………………………………………………………… May 21, 2014 12:00:00 AM <<BR>>
-   ¦   ¦           +-Date type……………………………………………………………… Revision
+   ¦   ¦           +-Date type……………………………………………………………… Revision <<BR>>
-   ¦   +-Descriptive keywords (2 of 5)
+   ¦   +-Descriptive keywords (2 of 5) <<BR>>
-   ¦   ¦   +-Keyword………………………………………………………………………………………… FIELD SURVEY
+   ¦   ¦   +-Keyword………………………………………………………………………………………… FIELD SURVEY <<BR>>
-   ¦   ¦   +-Type………………………………………………………………………………………………… Theme
+   ¦   ¦   +-Type………………………………………………………………………………………………… Theme <<BR>>
-   ¦   ¦   +-Thesaurus name
+   ¦   ¦   +-Thesaurus name <<BR>>
-   ¦   ¦       +-Title…………………………………………………………………………………… ACADIS Keywords
+   ¦   ¦       +-Title…………………………………………………………………………………… ACADIS Keywords <<BR>>
-   ¦   ¦       +-Alternate title………………………………………………………… Platforms
+   ¦   ¦       +-Alternate title………………………………………………………… Platforms <<BR>>
-   ¦   ¦       +-Date
+   ¦   ¦       +-Date <<BR>>
-   ¦   ¦           +-Date…………………………………………………………………………… Oct 7, 2014 12:00:00 AM
+   ¦   ¦           +-Date…………………………………………………………………………… Oct 7, 2014 12:00:00 AM <<BR>>
-   ¦   ¦           +-Date type……………………………………………………………… Revision
+   ¦   ¦           +-Date type……………………………………………………………… Revision <<BR>>
-
+ <<BR>>
  '''Corresponding Customized Meta Data is as below'''
  -----------------------------------------------
- CharacterSet-->UTF-8
+ CharacterSet-->UTF-8 <<BR>>
- ContactRole-->RESOURCE_PROVIDER
+ ContactRole-->RESOURCE_PROVIDER <<BR>>
- ContactPartyName-->UCAR/NCAR - CISL - ACADIS
+ ContactPartyName-->UCAR/NCAR - CISL - ACADIS <<BR>>
- IdentificationInfoCitationTitle-->Barrow Atqasuk ARCSS Plant
+ IdentificationInfoCitationTitle-->Barrow Atqasuk ARCSS Plant <<BR>>
- CitationDateCREATION-->Mon Dec 16 00:00:00 PST 2013
+ CitationDateCREATION-->Mon Dec 16 00:00:00 PST 2013 <<BR>>
- CitationDatemodified-->Thu Feb 05 00:00:00 PST 2015
+ CitationDatemodified-->Thu Feb 05 00:00:00 PST 2015 <<BR>>
- CitedResponsiblePartyRole-->Role[POINT_OF_CONTACT]
+ CitedResponsiblePartyRole-->Role[POINT_OF_CONTACT] <<BR>>
- CitedResponsiblePartyName-->Robert Hollister
+ CitedResponsiblePartyName-->Robert Hollister <<BR>>
- CitedResponsiblePartyOrganizationName-->null
+ CitedResponsiblePartyOrganizationName-->null <<BR>>
- CitedResponsiblePartyPositionName-->null
+ CitedResponsiblePartyPositionName-->null <<BR>>
- CitedResponsiblePartyEMail-->[email protected]
+ CitedResponsiblePartyEMail-->[email protected] <<BR>>
- IdentificationInfoAbstract-->These files contain data representing the periodic plant measures of species within each plot in a text tab delimited format. The data presented are seasonal growth of graminoids (length of leaf and length of inflorescence) and seasonal flowering of all species (number of inflorescences in flower within a plot), collected weekly during the summers of 2012-20XX for a subset of 30 grid plots at two sites (Barrow ARCSS grid and Atqasuk ARCSS grid).
+ IdentificationInfoAbstract-->These files contain data representing the periodic plant measures of species within each plot in a text tab delimited format. The data presented are seasonal growth of graminoids (length of leaf and length of inflorescence) and seasonal flowering of all species (number of inflorescences in flower within a plot), collected weekly during the summers of 2012-20XX for a subset of 30 grid plots at two sites (Barrow ARCSS grid and Atqasuk ARCSS grid). <<BR>>
- IdentificationInfoStatus-->ON_GOING
+ IdentificationInfoStatus-->ON_GOING <<BR>>
- ResourceFormatSpecificationAlternativeTitle-->Other ASCII
+ ResourceFormatSpecificationAlternativeTitle-->Other ASCII <<BR>>
- IdentificationInfoLanguage-->English
+ IdentificationInfoLanguage-->English <<BR>>
- IdentificationInfoTopicCategory-->BIOTA
+ IdentificationInfoTopicCategory-->BIOTA <<BR>>
- DescriptiveKeyWords 1
+ DescriptiveKeyWords 1 <<BR>>
- =======================
+ ======================= <<BR>>
- Keywords-->EARTH SCIENCE > BIOSPHERE > TERRESTRIAL ECOSYSTEMS > ALPINE/TUNDRA
+ Keywords-->EARTH SCIENCE > BIOSPHERE > TERRESTRIAL ECOSYSTEMS > ALPINE/TUNDRA <<BR>>
- KeywordsType-->THEME
+ KeywordsType-->THEME <<BR>>
- ThesaurusNameTitle-->NASA/GCMD Earth Science Keywords
+ ThesaurusNameTitle-->NASA/GCMD Earth Science Keywords <<BR>>
- ThesaurusNameAlternativeTitle-->[Science and Services Keywords]
+ ThesaurusNameAlternativeTitle-->[Science and Services Keywords] <<BR>>
- ThesaurusNameDateREVISION-->Wed May 21 00:00:00 PDT 2014
+ ThesaurusNameDateREVISION-->Wed May 21 00:00:00 PDT 2014 <<BR>>
- DescriptiveKeyWords 2
+ DescriptiveKeyWords 2 <<BR>>
- =======================
+ ======================= <<BR>>
- Keywords-->FIELD SURVEY
+ Keywords-->FIELD SURVEY <<BR>>
- KeywordsType-->THEME
+ KeywordsType-->THEME <<BR>>
- ThesaurusNameTitle-->ACADIS Keywords
+ ThesaurusNameTitle-->ACADIS Keywords <<BR>>
- ThesaurusNameAlternativeTitle-->[Platforms]
+ ThesaurusNameAlternativeTitle-->[Platforms] <<BR>>
- ThesaurusNameDateREVISION-->Tue Oct 07 00:00:00 PDT 2014
+ ThesaurusNameDateREVISION-->Tue Oct 07 00:00:00 PDT 2014 <<BR>>
  I definitely feel that the key names could be much shorter; your suggestions would be appreciated.
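As a rough illustration of the key multi-value format from step 3 (key1->[val1,val2..] , key2->[val1,val2...]), the sketch below collects values per key in insertion order and prints them in that shape. The class `KeyMultiValueDemo` and its methods are hypothetical names used only for this example; they are not part of Apache Tika or Apache SIS, and this is not the parser's actual code. Two choices here are assumptions rather than anything decided on the page: null values are skipped instead of being stored as the string "null", and the keywords from the separate "DescriptiveKeyWords N" groups are folded under a single `Keywords` key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not an Apache Tika or Apache SIS class.
final class KeyMultiValueDemo {
    // LinkedHashMap keeps keys in the order they were first added.
    private final Map<String, List<String>> entries = new LinkedHashMap<>();

    // Appends a value under a key. Assumption: null values are dropped
    // rather than recorded as the literal text "null".
    void add(String key, String value) {
        if (value == null) {
            return;
        }
        entries.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Returns the values stored for a key, or an empty list if there are none.
    List<String> values(String key) {
        return entries.getOrDefault(key, Collections.emptyList());
    }

    // Renders the entries as key1->[val1,val2] , key2->[val1].
    String format() {
        StringJoiner out = new StringJoiner(" , ");
        for (Map.Entry<String, List<String>> e : entries.entrySet()) {
            out.add(e.getKey() + "->[" + String.join(",", e.getValue()) + "]");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        KeyMultiValueDemo md = new KeyMultiValueDemo();
        md.add("CharacterSet", "UTF-8");
        md.add("Keywords", "EARTH SCIENCE > BIOSPHERE > TERRESTRIAL ECOSYSTEMS > ALPINE/TUNDRA");
        md.add("Keywords", "FIELD SURVEY");
        md.add("CitedResponsiblePartyOrganizationName", null); // skipped
        System.out.println(md.format());
    }
}
```

Tika's own `org.apache.tika.metadata.Metadata` class already supports several values per name through `add(String, String)` and `getValues(String)`, so a map like this may only be useful as an intermediate step while the key names are still being worked out.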
