Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.

The "GrobidJournalParser" page has been changed by ChrisMattmann:
https://wiki.apache.org/tika/GrobidJournalParser?action=diff&rev1=6&rev2=7

Comment:
- update Tika Server example.

              "org.apache.tika.parser.CompositeParser",
              "org.apache.tika.parser.journal.JournalParser"
          ],
-         "X-TIKA:content": 
"\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nProceedings
 Template - WORD\n\n\nA Software Architecture-Based Framework for Highly 
\nDistributed and Data Intensive Scientific Applications \n\n \nChris A. 
Mattmann1, 2        Daniel J. Crichton1        Nenad Medvidovic2        Steve 
Hughes1 \n\n \n1Jet Propulsion Laboratory \n\nCalifornia Institute of 
Technology \nPasadena, CA 91109, USA 
\n\n{dan.crichton,mattmann,steve.hughes}@jpl.nasa.gov \n\n2Computer Science 
Department \nUniversity of Southern California  \n\nLos Angeles, CA 90089, USA 
\n{mattmann,neno}@usc.edu \n\n \nABSTRACT \nModern scientific research is 
increasingly conducted by virtual \ncommunities of scientists distributed 
around the world. The data \nvolumes created by these communities are extremely 
large, and \ngrowing rapidly. The management of the resulting highly 
\ndistributed, virtual data systems is a complex task, characterized \nby a 
number of formidable technical challenges, many of which \nare of a software 
engineering nature.  In this paper we describe \nour experience over the past 
seven years in constructing and \ndeploying OODT, a software framework that 
supports large, \ndistributed, virtual scientific communities. We outline the 
key \nsoftware engineering challenges that we faced, and addressed, \nalong the 
way. We argue that a major contributor to the success of \nOODT was its 
explicit focus on software architecture. We \ndescribe several large-scale, 
real-world deployments of OODT, \nand the manner in which OODT helped us to 
address the domain-\nspecific challenges induced by each deployment.  
\n\nCategories and Subject Descriptors \nD.2 Software Engineering, D.2.11 
Domain Specific Architectures \n\nKeywords \nOODT, Data Management, Software 
Architecture. \n\n1. INTRODUCTION \nSoftware systems of today are very large, 
highly complex, \n\noften widely distributed, increasingly decentralized, 
dynamic, and \nmobile.  There are many causes behind this, spanning virtually 
all \nfacets of human endeavor: desired advances in education, \nentertainment, 
medicine, military technology, \ntelecommunications, transportation, and so on. 
  \n\nOne major driver of software\u2019s growing complexity is \nscientific 
research and exploration.  Today\u2019s scientists are solving \nproblems of 
until recently unimaginable complexity with the help \nof software.  They also 
actively and regularly collaborate with \n\ncolleagues around the world, 
something that ..snip",
+         "X-TIKA:content": 
"\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nProceedings
 Template - WORD\n\n\nA Software Architecture-Based Framework for Highly 
\nDistributed and Data Intensive Scientific Applications \n\n \nChris A. 
Mattmann1, 2        Daniel J. Crichton1        Nenad Medvidovic2        Steve 
Hughes1 \n\n \n1Jet Propulsion Laboratory \n\nCalifornia Institute of 
Technology \nPasadena, CA 91109, USA 
\n\n{dan.crichton,mattmann,steve.hughes}@jpl.nasa.gov \n\n2Computer Science 
Department \nUniversity of Southern California  \n\nLos Angeles, CA 90089, USA 
\n{mattmann,neno}@usc.edu \n\n \nABSTRACT \nModern scientific research is 
increasingly conducted by virtual \ncommunities of scientists distributed 
around the world. The data \nvolumes created by these communities are extremely 
large, and \ngrowing rapidly. The management of the resulting highly 
\ndistributed, virtual data systems is a complex task, characterized \nby a 
number of formidable technical challenges, many of which \nare of a software 
engineering nature.  In this paper we describe \nour experience over the past 
seven years in constructing and \ndeploying OODT, a software framework that 
supports large, \ndistributed, virtual scientific communities. We outline the 
key \nsoftware engineering challenges that we faced, and addressed, \nalong the 
way. We argue that a major contributor to the success of \nOODT was its 
explicit focus on software architecture. We \ndescribe several large-scale, 
real-world deployments of OODT, \nand the manner in which OODT helped us to 
address the domain-\nspecific challenges induced by each deployment.  
\n\nCategories and Subject Descriptors \nD.2 Software Engineering, D.2.11 
Domain Specific Architectures \n\nKeywords \nOODT, Data Management, Software 
Architecture. \n\n1. INTRODUCTION ..snip..",
-         "X-TIKA:parse_time_millis": "12348",
+         "X-TIKA:parse_time_millis": "957",
          "access_permission:assemble_document": "true",
          "access_permission:can_modify": "true",
          "access_permission:can_print": "true",
@@ -156, +156 @@

          "dc:title": "Proceedings Template - WORD",
          "dcterms:created": "2006-02-15T21:13:58Z",
          "dcterms:modified": "2006-02-15T21:16:01Z",
-         "grobid:header_Abstract": "Modern scientific research is increasingly 
conducted by virtual communities of scientists distributed around the world. 
The data volumes created by these communities are extremely large, and growing 
rapidly. The management of the resulting highly distributed, virtual data 
systems is a complex task, characterized by a number of formidable technical 
challenges, many of which are of a software engineering nature. In this paper 
we describe our experience over the past seven years in constructing and 
deploying OODT, a software framework that supports large, distributed, virtual 
scientific communities. We outline the key software engineering challenges that 
we faced, and addressed, along the way. We argue that a major contributor to 
the success of OODT was its explicit focus on software architecture. We 
describe several large-scale, real-world deployments of OODT, and the manner in 
which OODT helped us to address the domain-specific challenges induced by each 
deployment.",
-         "grobid:header_AbstractHeader": "ABSTRACT",
-         "grobid:header_Address": "Pasadena, CA 91109, USA Los Angeles, CA 
90089, USA",
+         "grobid:header_Address": "Pasadena, CA 91109 USA Los Angeles, CA 
90089 USA ",
-         "grobid:header_Affiliation": "1 Jet Propulsion Laboratory California 
Institute of Technology ; 2 Computer Science Department University of Southern 
California",
+         "grobid:header_Affiliation": "1 Jet Propulsion Laboratory California 
Institute of Technology; 2 Computer Science Department University of Southern 
California",
-         "grobid:header_Authors": "Chris A. Mattmann 1, 2 Daniel J. Crichton 1 
Nenad Medvidovic 2 Steve Hughes 1",
+         "grobid:header_Authors": "Chris A Mattmann 1,2 Daniel J Crichton 1 
Nenad  Medvidovic 2 Steve  Hughes 1 ",
+         "grobid:header_Class": "org.apache.tika.metadata.Metadata",
+         "grobid:header_FullAffiliations": "[Affiliation {orgName=Jet 
Propulsion Laboratory California Institute of Technology , address=Pasadena, CA 
91109 USA},Affiliation {orgName=Computer Science Department University of 
Southern California , address=Los Angeles, CA 90089 USA}[Affiliation 
{orgName=Jet Propulsion Laboratory California Institute of Technology , 
address=Pasadena, CA 91109 USA},Affiliation {orgName=Computer Science 
Department University of Southern California , address=Los Angeles, CA 90089 
USA}]",
+         "grobid:header_Keyword": "\"D2 Software Engineering, D211 Domain 
Specific Architectures\"",
+         "grobid:header_TEIJSONSource": 
"{\"TEI\":{\"text\":{\"xml:lang\":\"en\"},\"teiHeader\": ..snip",
+         "grobid:header_TEIXMLSource": "<?xml version=\"1.0\" 
encoding=\"UTF-8\"?>\n<?xml-model 
href=\"file:///Users/mattmann/git/grobid/grobid-home/schemas/rng/Grobid.rng\" 
schematypens=\"http://relaxng.org/ns/structure/1.0\"?>\n<TEI 
xmlns=\"http://www.tei-c.org/ns/1.0\">\n\t<teiHeader 
xml:lang=\"en\">\n\t\t<fileDesc>\n\t\t\t<titleStmt>\n\t\t\t\t<title level=\"a\" 
type=\"main\">A Software Architecture-Based Framework for Highly Distributed 
and Data Intensive Scientific Applications</title>..snip..</TEI>\n",
-         "grobid:header_BeginPage": "-1",
-         "grobid:header_Class": "class org.grobid.core.data.BiblioItem",
-         "grobid:header_Email": 
"{dan.crichton,mattmann,steve.hughes}@jpl.nasa.gov ; {mattmann,neno}@usc.edu",
-         "grobid:header_EndPage": "-1",
-         "grobid:header_Error": "true",
-         "grobid:header_FirstAuthorSurname": "Mattmann",
-         "grobid:header_FullAffiliations": "[Affiliation{name='null', 
url='null', institutions=[California Institute of Technology], 
departments=null, laboratories=[Jet Propulsion Laboratory], country='USA', 
postCode='91109', postBox='null', region='CA', settlement='Pasadena', 
addrLine='null', marker='1', addressString='null', affiliationString='null', 
failAffiliation=false}, Affiliation{name='null', url='null', 
institutions=[University of Southern California], departments=[Computer Science 
Department], laboratories=null, country='USA', postCode='90089', 
postBox='null', region='CA', settlement='Los Angeles', addrLine='null', 
marker='2', addressString='null', affiliationString='null', 
failAffiliation=false}]",
-         "grobid:header_FullAuthors": "[Chris A Mattmann, Daniel J Crichton, 
Nenad Medvidovic, Steve Hughes]",
-         "grobid:header_Item": "-1",
-         "grobid:header_Keyword": "Categories and Subject Descriptors D2 
Software Engineering, D211 Domain Specific Architectures Keywords OODT, Data 
Management, Software Architecture",
-         "grobid:header_Keywords": "[D2 Software Engineering, D211 Domain 
Specific Architectures  (type:subject-headers), Keywords  
(type:subject-headers), OODT, Data Management, Software Architecture  
(type:subject-headers)]",
-         "grobid:header_Language": "en",
-         "grobid:header_NbPages": "-1",
-         "grobid:header_OriginalAuthors": "Chris A. Mattmann 1, 2 Daniel J. 
Crichton 1 Nenad Medvidovic 2 Steve Hughes 1",
          "grobid:header_Title": "A Software Architecture-Based Framework for 
Highly Distributed and Data Intensive Scientific Applications",
          "meta:author": "End User Computing Services",
          "meta:creation-date": "2006-02-15T21:13:58Z",
@@ -189, +178 @@

          "xmpTPg:NPages": "10"
      }
  ]
+ 
  }}}
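
As a rough illustration of how the updated output might be consumed, the sketch below pulls only the "grobid:header_*" entries out of a JSON list shaped like the Tika Server response above. The function name grobid_header_fields and the SAMPLE payload are hypothetical; the sample reuses a few values from the diff and is not the full response.

```python
import json


def grobid_header_fields(rmeta_json):
    """Return, per document, only the metadata keys that start with 'grobid:header_'.

    Assumes the input is a JSON array of flat metadata objects, as in the
    example output above. The helper name is illustrative, not a Tika API.
    """
    records = json.loads(rmeta_json)
    return [
        {key: value for key, value in record.items()
         if key.startswith("grobid:header_")}
        for record in records
    ]


# Hypothetical, heavily trimmed payload built from values shown in the diff.
SAMPLE = json.dumps([
    {
        "X-TIKA:parse_time_millis": "957",
        "dc:title": "Proceedings Template - WORD",
        "grobid:header_Title": (
            "A Software Architecture-Based Framework for Highly Distributed "
            "and Data Intensive Scientific Applications"
        ),
        "xmpTPg:NPages": "10",
    }
])

if __name__ == "__main__":
    for fields in grobid_header_fields(SAMPLE):
        for key, value in sorted(fields.items()):
            print(f"{key}: {value}")
```

Filtering on the key prefix keeps the sketch independent of which GROBID fields a given Tika version emits, which matters here because this revision adds and removes several of them.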
  
