Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/complex.mbox URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/complex.mbox?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/complex.mbox (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/complex.mbox Mon Dec 28 23:10:16 2015 @@ -0,0 +1,291 @@ +From core-user-return-14700-apmail-hadoop-core-user-archive=hadoop.apache....@hadoop.apache.org Mon Jun 01 04:28:28 2009 +Return-Path: <core-user-return-14700-apmail-hadoop-core-user-archive=hadoop.apache....@hadoop.apache.org> +Delivered-To: [email protected] +Received: (qmail 19921 invoked from network); 1 Jun 2009 04:28:28 -0000 +Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) + by minotaur.apache.org with SMTP; 1 Jun 2009 04:28:28 -0000 +Received: (qmail 84995 invoked by uid 500); 1 Jun 2009 04:28:38 -0000 +Delivered-To: [email protected] +Received: (qmail 84895 invoked by uid 500); 1 Jun 2009 04:28:38 -0000 +Mailing-List: contact [email protected]; run by ezmlm +Precedence: bulk +List-Help: <mailto:[email protected]> +List-Unsubscribe: <mailto:[email protected]> +List-Post: <mailto:[email protected]> +List-Id: <core-user.hadoop.apache.org> +Reply-To: [email protected] +Delivered-To: mailing list [email protected] +Received: (qmail 84885 invoked by uid 99); 1 Jun 2009 04:28:38 -0000 +Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) + by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 04:28:38 +0000 +X-ASF-Spam-Status: No, hits=1.2 required=10.0 + tests=SPF_NEUTRAL +X-Spam-Check-By: apache.org +Received-SPF: neutral (athena.apache.org: local policy) +Received: from [69.147.107.21] (HELO mrout2-b.corp.re1.yahoo.com) (69.147.107.21) + by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 
2009 04:28:26 +0000 +Received: from SNV-EXPF01.ds.corp.yahoo.com (snv-expf01.ds.corp.yahoo.com [207.126.227.250]) + by mrout2-b.corp.re1.yahoo.com (8.13.8/8.13.8/y.out) with ESMTP id n514QYA6099963 + for <[email protected]>; Sun, 31 May 2009 21:26:35 -0700 (PDT) +DomainKey-Signature: a=rsa-sha1; s=serpent; d=yahoo-inc.com; c=nofws; q=dns; + h=received:user-agent:date:subject:from:to:message-id: + thread-topic:thread-index:in-reply-to:mime-version:content-type: + content-transfer-encoding:x-originalarrivaltime; + b=YVtSNdgjeeSBS1yY3XDolul49i+HrgNG7QszMo9LzGnrwejjgsl5+iUM6EiQgEpV +Received: from SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.9]) by SNV-EXPF01.ds.corp.yahoo.com with Microsoft SMTPSVC(6.0.3790.3959); + Sun, 31 May 2009 21:26:34 -0700 +Received: from 10.66.92.213 ([10.66.92.213]) by SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.58]) with Microsoft Exchange Server HTTP-DAV ; + Mon, 1 Jun 2009 04:26:33 +0000 +User-Agent: Microsoft-Entourage/12.17.0.090302 +Date: Mon, 01 Jun 2009 09:56:31 +0530 +Subject: Re: question about when shuffle/sort start working +From: Jothi Padmanabhan <[email protected]> +To: <[email protected]> +Message-ID: <c649564f.1435f%[email protected]> +Thread-Topic: question about when shuffle/sort start working +Thread-Index: AcnicSNoBw19cMU8UEaXwAdZ1YYhuw== +In-Reply-To: <[email protected]> +Mime-version: 1.0 +Content-type: text/plain; + charset="US-ASCII" +Content-transfer-encoding: 7bit +X-OriginalArrivalTime: 01 Jun 2009 04:26:34.0501 (UTC) FILETIME=[257EAB50:01C9E271] +X-Virus-Checked: Checked by ClamAV on apache.org + +When a Mapper completes, MapCompletionEvents are generated. Reducers try to +fetch map outputs for a given map only on the receipt of such events. + +Jothi + + +On 5/30/09 10:00 AM, "Jianmin Woo" <[email protected]> wrote: + +> Hi, +> I am being confused by the protocol between mapper and reducer. 
When mapper +> emitting the (key,value) pair done, is there any signal the mapper send out to +> hadoop framework in protocol to indicate that map is done and the shuffle/sort +> can begin for reducer? If there is no this signal in protocol, when the +> framework begin the shuffle/sort? +> +> Thanks, +> Jianmin +> +> +> +> + + +From core-user-return-14701-apmail-hadoop-core-user-archive=hadoop.apache....@hadoop.apache.org Mon Jun 01 05:31:14 2009 +Return-Path: <core-user-return-14701-apmail-hadoop-core-user-archive=hadoop.apache....@hadoop.apache.org> +Delivered-To: [email protected] +Received: (qmail 38243 invoked from network); 1 Jun 2009 05:31:14 -0000 +Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) + by minotaur.apache.org with SMTP; 1 Jun 2009 05:31:14 -0000 +Received: (qmail 15621 invoked by uid 500); 1 Jun 2009 05:31:24 -0000 +Delivered-To: [email protected] +Received: (qmail 15557 invoked by uid 500); 1 Jun 2009 05:31:24 -0000 +Mailing-List: contact [email protected]; run by ezmlm +Precedence: bulk +List-Help: <mailto:[email protected]> +List-Unsubscribe: <mailto:[email protected]> +List-Post: <mailto:[email protected]> +List-Id: <core-user.hadoop.apache.org> +Reply-To: [email protected] +Delivered-To: mailing list [email protected] +Received: (qmail 15547 invoked by uid 99); 1 Jun 2009 05:31:24 -0000 +Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) + by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 05:31:24 +0000 +X-ASF-Spam-Status: No, hits=2.2 required=10.0 + tests=HTML_MESSAGE,SPF_PASS +X-Spam-Check-By: apache.org +Received-SPF: pass (nike.apache.org: local policy) +Received: from [68.142.237.94] (HELO n9.bullet.re3.yahoo.com) (68.142.237.94) + by apache.org (qpsmtpd/0.29) with SMTP; Mon, 01 Jun 2009 05:31:11 +0000 +Received: from [68.142.237.88] by n9.bullet.re3.yahoo.com with NNFMP; 01 Jun 2009 05:30:50 -0000 +Received: from [67.195.9.82] by t4.bullet.re3.yahoo.com with NNFMP; 01 Jun 2009 
05:30:49 -0000 +Received: from [67.195.9.99] by t2.bullet.mail.gq1.yahoo.com with NNFMP; 01 Jun 2009 05:30:49 -0000 +Received: from [127.0.0.1] by omp103.mail.gq1.yahoo.com with NNFMP; 01 Jun 2009 05:28:01 -0000 +X-Yahoo-Newman-Property: ymail-3 +X-Yahoo-Newman-Id: [email protected] +Received: (qmail 35264 invoked by uid 60001); 1 Jun 2009 05:30:49 -0000 +DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1243834249; bh=R8qzdi/IbLyO8UwpnaujDpT9E+6bJ7nkmZN2803EmRk=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type; b=vq4c6RIDbkuLPYd8mirusIXf6DqTb/IeT55In7W00Y5Sxx1ZiXBb78yE9+TDfXJ0elsEZvqv4ocyvolGE0eGtyYeJA0mZikpRNu6pidxPNpCplOcLHBRz7YQ7iERwv3TagRlWy2Xd3oD9ZeV0A05P7WUOiNNX1PUUJD1IVdrEZo= +DomainKey-Signature:a=rsa-sha1; q=dns; c=nofws; + s=s1024; d=yahoo.com; + h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type; + b=6HXZV98ON5vBwmE/xS8stVD0D2F4dkMY7a0suX5KVTb736JdR8G59mqBq/dWcpbFTLiCLtxi18LMb/dU1RKRGOEdn3l3j/jKXhBrhIgfg3qtNskPedXDKBvn7JGXiSkqpA/tUtPjvc0Uuk8/LaA01SQTz40Engg7nD8/EJdIAhA=; +Message-ID: <[email protected]> +X-YMail-OSG: KzhhrJYVM1m.MCS6vRpRP2ZZO2PrfnbngosELDCIa91ZqvhJph4RdmzfUW0jw9W04RCSch1K730bPohwNpNBIk2QR_zt4_mfbhfq7YEPkSoz9LSXG90P9vIo5Fc8qyZN0U6vA9gtdyGQTpN5ahvillUH9nAF0TMWv2SvZJLjPlQ0Z0p8oK8ltBwGTgLrM8Jtdn9D29yoRyi3_EpVOfdD9OP.EK50Vr1XwSUYMbnpZ0WGHMwd.Yig7A6Elwadm3YVbfOdx2mfrG.jQsUAxQjRBNvbrOM57.FaE11kHTe9aoBWSeihNg-- +Received: from [216.145.54.7] by web111010.mail.gq1.yahoo.com via HTTP; Sun, 31 May 2009 22:30:49 PDT +X-Mailer: YahooMailRC/1277.43 YahooMailWebService/0.7.289.10 +References: <c649564f.1435f%[email protected]> +Date: Sun, 31 May 2009 22:30:49 -0700 (PDT) +From: Jianmin Woo <[email protected]> +Subject: Re: question about when shuffle/sort start working +To: [email protected] +In-Reply-To: <c649564f.1435f%[email protected]> +MIME-Version: 1.0 +Content-Type: multipart/alternative; 
boundary="0-1193839393-1243834249=:35091" +X-Virus-Checked: Checked by ClamAV on apache.org + +--0-1193839393-1243834249=:35091 +Content-Type: text/plain; charset=us-ascii + +Thanks a lot for your explanation, Jothi. + +So is this event generated by hadoop framework? Is there any API in mapper to fire this event? Actually, I am thinking to implement a mapper that will emit some <key, value> pairs, then fire this event to let the reducer works, the same mapper task then emit some other <key, value> pairs and repeat. Do you think is this logic feasible by current API? + +Thanks, +Jianmin + + + + + +________________________________ +From: Jothi Padmanabhan <[email protected]> +To: [email protected] +Sent: Monday, June 1, 2009 12:26:31 PM +Subject: Re: question about when shuffle/sort start working + +When a Mapper completes, MapCompletionEvents are generated. Reducers try to +fetch map outputs for a given map only on the receipt of such events. + +Jothi + + +On 5/30/09 10:00 AM, "Jianmin Woo" <[email protected]> wrote: + +> Hi, +> I am being confused by the protocol between mapper and reducer. When mapper +> emitting the (key,value) pair done, is there any signal the mapper send out to +> hadoop framework in protocol to indicate that map is done and the shuffle/sort +> can begin for reducer? If there is no this signal in protocol, when the +> framework begin the shuffle/sort? 
+> +> Thanks, +> Jianmin +> +> +> +> + + + +--0-1193839393-1243834249=:35091-- + + +From core-user-return-14702-apmail-hadoop-core-user-archive=hadoop.apache....@hadoop.apache.org Mon Jun 01 06:04:30 2009 +Return-Path: <core-user-return-14702-apmail-hadoop-core-user-archive=hadoop.apache....@hadoop.apache.org> +Delivered-To: [email protected] +Received: (qmail 53387 invoked from network); 1 Jun 2009 06:04:29 -0000 +Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) + by minotaur.apache.org with SMTP; 1 Jun 2009 06:04:29 -0000 +Received: (qmail 39066 invoked by uid 500); 1 Jun 2009 06:04:39 -0000 +Delivered-To: [email protected] +Received: (qmail 38970 invoked by uid 500); 1 Jun 2009 06:04:39 -0000 +Mailing-List: contact [email protected]; run by ezmlm +Precedence: bulk +List-Help: <mailto:[email protected]> +List-Unsubscribe: <mailto:[email protected]> +List-Post: <mailto:[email protected]> +List-Id: <core-user.hadoop.apache.org> +Reply-To: [email protected] +Delivered-To: mailing list [email protected] +Received: (qmail 38955 invoked by uid 99); 1 Jun 2009 06:04:39 -0000 +Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) + by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 06:04:39 +0000 +X-ASF-Spam-Status: No, hits=1.2 required=10.0 + tests=SPF_NEUTRAL +X-Spam-Check-By: apache.org +Received-SPF: neutral (athena.apache.org: local policy) +Received: from [216.145.54.172] (HELO mrout2.yahoo.com) (216.145.54.172) + by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 06:04:28 +0000 +Received: from SNV-EXBH01.ds.corp.yahoo.com (snv-exbh01.ds.corp.yahoo.com [207.126.227.249]) + by mrout2.yahoo.com (8.13.6/8.13.6/y.out) with ESMTP id n5163FGq038852 + for <[email protected]>; Sun, 31 May 2009 23:03:15 -0700 (PDT) +DomainKey-Signature: a=rsa-sha1; s=serpent; d=yahoo-inc.com; c=nofws; q=dns; + h=received:user-agent:date:subject:from:to:message-id: + thread-topic:thread-index:in-reply-to:mime-version:content-type: + 
content-transfer-encoding:x-originalarrivaltime; + b=rChE4SCnwtWaZpjhovkiXDKfDiVNdRRvsadSGG9S9bgvOexn/9/5JjEQx1pOR7Nb +Received: from SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.9]) by SNV-EXBH01.ds.corp.yahoo.com with Microsoft SMTPSVC(6.0.3790.3959); + Sun, 31 May 2009 23:03:15 -0700 +Received: from 10.66.92.213 ([10.66.92.213]) by SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.58]) with Microsoft Exchange Server HTTP-DAV ; + Mon, 1 Jun 2009 06:03:15 +0000 +User-Agent: Microsoft-Entourage/12.17.0.090302 +Date: Mon, 01 Jun 2009 11:33:13 +0530 +Subject: Re: question about when shuffle/sort start working +From: Jothi Padmanabhan <[email protected]> +To: <[email protected]> +Message-ID: <c6496cf9.1437c%[email protected]> +Thread-Topic: question about when shuffle/sort start working +Thread-Index: AcnifqWrLG6N7GAk7kqy9QalVWfegQ== +In-Reply-To: <[email protected]> +Mime-version: 1.0 +Content-type: text/plain; + charset="US-ASCII" +Content-transfer-encoding: 7bit +X-OriginalArrivalTime: 01 Jun 2009 06:03:15.0462 (UTC) FILETIME=[A7231260:01C9E27E] +X-Virus-Checked: Checked by ClamAV on apache.org + + +No you cannot raise this event yourself, this event is generated internally +by the framework. + +I am guessing that what you probably want is to have a chain of MapReduce +Jobs where the output of one is automatically fed as input to another. You +can look at these classes: JobControl and ChainMapper/ChainReducer. + +Jothi + +On 6/1/09 11:00 AM, "Jianmin Woo" <[email protected]> wrote: + +> Thanks a lot for your explanation, Jothi. +> +> So is this event generated by hadoop framework? Is there any API in mapper to +> fire this event? Actually, I am thinking to implement a mapper that will emit +> some <key, value> pairs, then fire this event to let the reducer works, the +> same mapper task then emit some other <key, value> pairs and repeat. Do you +> think is this logic feasible by current API? 
+> +> Thanks, +> Jianmin +> +> +> +> +> +> ________________________________ +> From: Jothi Padmanabhan <[email protected]> +> To: [email protected] +> Sent: Monday, June 1, 2009 12:26:31 PM +> Subject: Re: question about when shuffle/sort start working +> +> When a Mapper completes, MapCompletionEvents are generated. Reducers try to +> fetch map outputs for a given map only on the receipt of such events. +> +> Jothi +> +> +> On 5/30/09 10:00 AM, "Jianmin Woo" <[email protected]> wrote: +> +>> Hi, +>> I am being confused by the protocol between mapper and reducer. When mapper +>> emitting the (key,value) pair done, is there any signal the mapper send out +>> to +>> hadoop framework in protocol to indicate that map is done and the +>> shuffle/sort +>> can begin for reducer? If there is no this signal in protocol, when the +>> framework begin the shuffle/sort? +>> +>> Thanks, +>> Jianmin +>> +>> +>> +>> +> +> +> + +
Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/egyl03.gdas.200811.00Z.grb2
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/egyl03.gdas.200811.00Z.grb2?rev=1722027&view=auto
==============================================================================
Binary file - no diff available.

Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/egyl03.gdas.200811.00Z.grb2
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream

Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/english.cp500.txt
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/english.cp500.txt?rev=1722027&view=auto
==============================================================================
--- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/english.cp500.txt (added)
+++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/english.cp500.txt Mon Dec 28 23:10:16 2015
@@ -0,0 +1 @@
+[single line of cp500 (EBCDIC) encoded English text; not legible in this rendering]
\ No newline at end of file

Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/envi_test_header.hdr
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/envi_test_header.hdr?rev=1722027&view=auto
==============================================================================
--- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/envi_test_header.hdr (added)
+++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/envi_test_header.hdr Mon Dec 28 23:10:16 2015
@@ -0,0 +1,16 @@
+ENVI
+description = {
+ GEO-TIFF File Imported into ENVI [Fri May 25 14:06:23 2012]}
+samples = 2400
+lines = 2400
+bands = 7
+header offset = 0
+file type = ENVI Standard
+data type = 2
+interleave = bip
+sensor type = Unknown
+byte order = 0
+map info = {Sinusoidal, 1.5000, 1.5000, -10007091.3643, 5559289.2856, 4.6331271653e+02, 4.6331271653e+02, , units=Meters}
+projection info = {16, 6371007.2, 0.000000, 0.0, 0.0, Sinusoidal, units=Meters}
+coordinate system string = 
{PROJCS["Sinusoidal",GEOGCS["GCS_ELLIPSE_BASED_1",DATUM["D_ELLIPSE_BASED_1",SPHEROID["S_ELLIPSE_BASED_1",6371007.181,0.0]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Sinusoidal"],PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",0.0],UNIT["Meter",1.0]]} +wavelength units = Unknown Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/footnotes.docx URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/footnotes.docx?rev=1722027&view=auto ============================================================================== Binary file - no diff available. Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/footnotes.docx ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/gdas1.forecmwf.2014062612.grib2 URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/gdas1.forecmwf.2014062612.grib2?rev=1722027&view=auto ============================================================================== Binary file - no diff available. Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/gdas1.forecmwf.2014062612.grib2 ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/headerPic.docx URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/headerPic.docx?rev=1722027&view=auto ============================================================================== Binary file - no diff available. 
Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/headerPic.docx ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/headers.mbox URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/headers.mbox?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/headers.mbox (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/headers.mbox Mon Dec 28 23:10:16 2015 @@ -0,0 +1,7 @@ +From envelope-sender-mailbox-name Mon Jun 01 10:00:00 2009 +Return-Path: <[email protected]> +Subject: subject +From: <[email protected]> +Date: Tue, 9 Jun 2009 23:58:45 -0400 + +Test content Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/jxl.xls URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/jxl.xls?rev=1722027&view=auto ============================================================================== Binary file - no diff available. Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/jxl.xls ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/moby.zip URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/moby.zip?rev=1722027&view=auto ============================================================================== Binary file - no diff available. 
Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/moby.zip ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/embedded_then_npe.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/embedded_then_npe.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/embedded_then_npe.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/embedded_then_npe.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,36 @@ +<?xml version="1.0" encoding="UTF-8" ?> + +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+--> + +<mock> + + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <write element="p">main_content</write> + <!-- auto detection wasn't working for some reason; add content-type as + is to trigger mock on the embedded --> + <embedded filename="embed1.xml" content-type="application/mock+xml"> + <mock> + <metadata action="add" name="author">embeddedAuthor</metadata> + <write element="p">some_embedded_content</write> + </mock> + </embedded> + <throw class="java.lang.NullPointerException">another null pointer exception</throw> + +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/example.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/example.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/example.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/example.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,51 @@ +<?xml version="1.0" encoding="UTF-8" ?> + +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+--> + +<mock> + <!-- this file offers all of the options as documentation + Parsing should stop at an IOException, of course + --> + + <!-- action can be "add" or "set" --> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <!-- element is the name of the sax event to write, p=paragraph + if the element is not specified, the default is <p> --> + <write element="p">some content</write> + <!-- write something to System.out --> + <print_out>writing to System.out</print_out> + <!-- write something to System.err --> + <print_err>writing to System.err</print_err> + <!-- hang + millis: how many milliseconds to pause. The actual hang time will probably + be a bit longer than the value specified. heavy: whether or not the hang should do something computationally expensive. + If the value is false, this just does a Thread.sleep(millis). + This attribute is optional, with default of heavy=false. + pulse_millis: (required if "heavy" is true), how often to check to see + whether the thread was interrupted or that the total hang time exceeded the millis + interruptible: whether or not the parser will check to see if its thread + has been interrupted; this attribute is optional with default of true + --> + <hang millis="100" heavy="true" pulse_millis="10" interruptible="true" /> + <!-- throw an exception or error; optionally include a message or not --> + <throw class="java.io.IOException">not another IOException</throw> + <!-- perform a genuine OutOfMemoryError --> + <oom/> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/fake_oom.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/fake_oom.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/fake_oom.xml (added) +++ 
tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/fake_oom.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,25 @@ +<?xml version="1.0" encoding="UTF-8" ?> + +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <throw class="java.lang.OutOfMemoryError">not another oom</throw> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/heavy_hang.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/heavy_hang.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/heavy_hang.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/heavy_hang.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,25 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. 
The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <write element="p">some content</write> + <hang millis="3000" heavy="true" pulse_millis="100" /> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/nothing_bad.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/nothing_bad.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/nothing_bad.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/nothing_bad.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,26 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> + +<mock> + <metadata action="add" name="author">Geoffrey Chaucer</metadata> + <write element="p">Whan that Aprille with his shoures soote</write> + <write>The droghte of Marche hath perced to the roote,</write> + <write>And bathed every veyne in swich licour,</write> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,25 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <write element="p">some content</write> + <throw class="java.lang.NullPointerException">another null pointer exception</throw> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer_no_msg.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer_no_msg.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer_no_msg.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/null_pointer_no_msg.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,25 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. 
See the License for the + specific language governing permissions and limitations + under the License. +--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <write element="p">some content</write> + <throw class="java.lang.NullPointerException"/> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/real_oom.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/real_oom.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/real_oom.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/real_oom.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,24 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <oom/> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,25 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <write element="p">some content</write> + <hang millis="3000" heavy="false" /> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_interruptible.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_interruptible.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_interruptible.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_interruptible.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,25 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <write element="p">some content</write> + <hang millis="3000" heavy="false" interruptible="true" /> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_not_interruptible.xml URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_not_interruptible.xml?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_not_interruptible.xml (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/mock/sleep_not_interruptible.xml Mon Dec 28 23:10:16 2015 @@ -0,0 +1,25 @@ +<?xml version="1.0" encoding="UTF-8" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+--> + +<mock> + <metadata action="add" name="author">Nikolai Lobachevsky</metadata> + <write element="p">some content</write> + <hang millis="3000" heavy="false" interruptible="false" /> +</mock> \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/multiline.mbox URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/multiline.mbox?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/multiline.mbox (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/multiline.mbox Mon Dec 28 23:10:16 2015 @@ -0,0 +1,5 @@ +From envelope-sender-mailbox-name Mon Jun 01 10:00:00 2009 +Received: from xxx + by xxx with xxx; date + +Test content Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/pictures.ppt URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/pictures.ppt?rev=1722027&view=auto ============================================================================== Binary file - no diff available. Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/pictures.ppt ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protect.xlsx URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protect.xlsx?rev=1722027&view=auto ============================================================================== Binary file - no diff available. 
Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protect.xlsx ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protectedFile.xlsx URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protectedFile.xlsx?rev=1722027&view=auto ============================================================================== Binary file - no diff available. Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protectedFile.xlsx ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protectedSheets.xlsx URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protectedSheets.xlsx?rev=1722027&view=auto ============================================================================== Binary file - no diff available. 
Propchange: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/protectedSheets.xlsx ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/quoted.mbox URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/quoted.mbox?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/quoted.mbox (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/quoted.mbox Mon Dec 28 23:10:16 2015 @@ -0,0 +1,4 @@ +From envelope-sender-mailbox-name Mon Jun 01 10:00:00 2009 + +Test content +> quoted stuff \ No newline at end of file Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/resume.html URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/resume.html?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/resume.html (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/resume.html Mon Dec 28 23:10:16 2015 @@ -0,0 +1,73 @@ + + + <div class="js-helper"> + <style type="text/css">#style_13209008630000000884_BODY{background-color:#FFFFFF;color:#000000;MARGIN:0px 1px;font-family:Tahoma,Arial,Verdana,Sans-Serif}#style_13209008630000000884 TD{font-size:13px;font-family:Tahoma,Arial,Verdana,Sans-Serif;vertical-align:top}#style_13209008630000000884 CAPTION{font-size:13px;font-weight:bold;text-align:left}#style_13209008630000000884 TR.style_13209008630000000884thead TD{font-weight:bold;text-align:center; padding-bottom:6px;padding-top:6px;padding-left:2px;padding-right:2px}#style_13209008630000000884 
H1{font-size:24px;margin-bottom:15px;margin-top:5px;display:block;font-weight:normal;}#style_13209008630000000884 H2{font-size:22px;margin-bottom:5px;margin-top:5px;display:block;font-weight:normal;letter-spacing:1px}#style_13209008630000000884 H1.style_13209008630000000884in, #style_13209008630000000884 H2.style_13209008630000000884in, #style_13209008630000000884 H3.style_13209008630000000884in{font-size:100%;margin-bottom:0px;margin-top:0px;di splay:inline;}#style_13209008630000000884 A, #style_13209008630000000884 A.style_13209008630000000884notvisited:visited, #style_13209008630000000884 .style_13209008630000000884notvisited A:visited, #style_13209008630000000884 .style_13209008630000000884menu A:visited{color:#00418F;text-decoration:none}#style_13209008630000000884 A:visited{color:#6699CC;text-decoration:none;}#style_13209008630000000884 A:hover, #style_13209008630000000884 A.style_13209008630000000884notvisited:hover, #style_13209008630000000884 .style_13209008630000000884notvisited A:hover, #style_13209008630000000884 .style_13209008630000000884menu A:hover{color:#990000;text-decoration:underline}#style_13209008630000000884 .style_13209008630000000884bold, #style_13209008630000000884 .style_13209008630000000884bold H1{font-weight:bold}#style_13209008630000000884 .style_13209008630000000884u{text-decoration:underline}#style_13209008630000000884 .style_13209008630000000884gray, #style_13209008630000000884 A.style_13209 008630000000884gray:visited, #style_13209008630000000884 LEGEND{color:#7A7A7A}#style_13209008630000000884 .style_13209008630000000884red, #style_13209008630000000884 A.style_13209008630000000884red:visited{color:#C2311A}#style_13209008630000000884 EM, #style_13209008630000000884 .style_13209008630000000884imp, #style_13209008630000000884 .style_13209008630000000884field_warning{color:#C2311A;font-weight:bold;font-style:normal}#style_13209008630000000884 TABLE.style_13209008630000000884bl_table TR TD{padding:2px; 
padding-left:10px}#style_13209008630000000884 TD.style_13209008630000000884bl_row_name{color:#555; width:10%}#style_13209008630000000884 TD.style_13209008630000000884vacancydark, #style_13209008630000000884 TR.style_13209008630000000884vacancydark TD, #style_13209008630000000884 TD.style_13209008630000000884resumedark, #style_13209008630000000884 TR.style_13209008630000000884resumedark TD, #style_13209008630000000884 TD.style_13209008630000000884serverdark, #style_1320900863 0000000884 TR.style_13209008630000000884serverdark TD{text-align:center;padding-bottom:3px;padding-top:3px;padding-left:1px;padding-right:1px;font-weight:bold}#style_13209008630000000884 TD.style_13209008630000000884vacancydark, #style_13209008630000000884 TR.style_13209008630000000884vacancydark TD, #style_13209008630000000884 TD.style_13209008630000000884vacancydark A, #style_13209008630000000884 TD.style_13209008630000000884vacancydark A:visited, #style_13209008630000000884 TD.style_13209008630000000884vacancydark A:hover, #style_13209008630000000884 TD.style_13209008630000000884resumedark, #style_13209008630000000884 TR.style_13209008630000000884resumedark TD, #style_13209008630000000884 TD.style_13209008630000000884resumedark A, #style_13209008630000000884 TD.style_13209008630000000884resumedark A:visited, #style_13209008630000000884 TD.style_13209008630000000884resumedark A:hover, #style_13209008630000000884 TD.style_13209008630000000884serverdark, #style_13209008630000000884 TR.style_13209008630000000884serverdark TD, #style_13209008630000000884 TD.style_13209008630000000884serverdark A, #style_13209008630000000884 TD.style_13209008630000000884serverdark A:visited, #style_13209008630000000884 TD.style_13209008630000000884serverdark A:hover{color:#000000;}#style_13209008630000000884 TD.style_13209008630000000884vacancydark, #style_13209008630000000884 TR.style_13209008630000000884vacancydark TD{background-color:#FFDDBB;}#style_13209008630000000884 
TD.style_13209008630000000884vacancylight, #style_13209008630000000884 TR.style_13209008630000000884vacancylight TD{background-color:#FFF5EC}#style_13209008630000000884 TD.style_13209008630000000884resumedark, #style_13209008630000000884 TR.style_13209008630000000884resumedark TD{background-color:#D3E9E9;}#style_13209008630000000884 TD.style_13209008630000000884resumelight, #style_13209008630000000884 TR.style_13209008630000000884resumelight TD{background-color:#ECF8F7}#style_13209008630000000884 TD.style_13209 008630000000884serverdark, #style_13209008630000000884 TR.style_13209008630000000884serverdark TD{background-color:#ABC2D5;}#style_13209008630000000884 TR.style_13209008630000000884serverlight TD, #style_13209008630000000884 TD.style_13209008630000000884serverlight{background-color:#E2EBF5}#style_13209008630000000884 TD.style_13209008630000000884blankheader1{font-size:24px; padding:10px}#style_13209008630000000884 TD.style_13209008630000000884blankheader2{font-size:22px; padding:10px}#style_13209008630000000884 TABLE.style_13209008630000000884resumelist TR.thead TD{background-color:#ABC2D5;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist TR.thead TD{background-color:#ABC2D5;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TR.thead TD{background-color:#DBDBDB;}#style_13209008630000000884 TABLE TR.style_13209008630000000884wr TD{background-color:#FFFFFF}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD{bord er-bottom:#DBDBDB 1px solid}#style_13209008630000000884 .style_13209008630000000884list TR TD{background-color:#E2EBF5;padding:5px}#style_13209008630000000884 .style_13209008630000000884list TR.thead TD{background-color:#ABC2D5;color:#555555;text-align:center; padding-bottom:8px;padding-top:8px;padding-left:1px;padding-right:1px;font-weight:bold;}#style_13209008630000000884 .style_13209008630000000884list TR.wr TD{background-color:#F3F7FB}#style_13209008630000000884 
A.style_13209008630000000884list_details, #style_13209008630000000884 A.style_13209008630000000884list_details:visited, #style_13209008630000000884 A.style_13209008630000000884list_details:hover{color:#7A7A7A;text-decoration:none;line-height:120%}#style_13209008630000000884 TD.style_13209008630000000884cell, #style_13209008630000000884 TD.style_13209008630000000884c{padding-top:3px;padding-left:5px;padding-right:5px}#style_13209008630000000884 BIG{font-size:24px}#style_13209008630000000884 .style_13209008630000000884smal l, #style_13209008630000000884 SMALL{font-size:85%}#style_13209008630000000884 UL{margin-left:25px;margin-bottom:0px}#style_13209008630000000884 TD.style_13209008630000000884small, #style_13209008630000000884 .style_13209008630000000884verysmall, #style_13209008630000000884 .style_13209008630000000884verysmall INPUT, #style_13209008630000000884 .style_13209008630000000884verysmall SELECT{font-size:11px}#style_13209008630000000884 DIV.style_13209008630000000884localmenu{padding-top:10px;margin-bottom:15px;}#style_13209008630000000884 DIV.style_13209008630000000884localmenu A, #style_13209008630000000884 DIV.style_13209008630000000884localmenu A:visited{text-decoration:underline;font-weight:bold}#style_13209008630000000884 DIV.style_13209008630000000884comment{font-size:85%; background-color:#DDFFDD; padding:4px; border:1px solid #CCC;cursor:default;}#style_13209008630000000884 HR{color:#ABC2D5;background-color:#ABC2D5;height:1px;border:0px solid #ABC2D5}#style_13209008630000000884 DI V.style_13209008630000000884dotsline{font-size:1px; margin-top:4px; margin-bottom:5px; border-bottom:#BACBD7 1px dotted}#style_13209008630000000884 TABLE.style_13209008630000000884rctable TR TD{background-color:#E5EDF7;}#style_13209008630000000884 TD.style_13209008630000000884rc1{padding-top:10px; padding-left:10px;}#style_13209008630000000884 TD.style_13209008630000000884rc2{font-size:1px; width:10px;}#style_13209008630000000884 
TD.style_13209008630000000884rc3{height:10px; font-size:1px;}#style_13209008630000000884 TD.style_13209008630000000884rc4{height:10px; font-size:1px;}#style_13209008630000000884 SPAN.style_13209008630000000884super{color:#003398;font-size:150%}#style_13209008630000000884 SPAN.style_13209008630000000884job{color:#FF0000;font-size:150%}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD TABLE.to_site_button{background-color:#99cc00; margin:0px 5px 3px 0px;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD TABLE.to_site_button TD{background-color:#99cc00; font-weight:normal; color:#ffffff; border-bottom:0px; padding-top:6px; padding-right:7px; padding-bottom:6px; padding-left:7px; vertical-align:middle; text-align:center;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD TABLE.to_site_button TD A, #style_13209008630000000884 TABLE.style_13209008630000000884to_site_button TD A:visited{color:#ffffff; text-decoration:none; font-weight:normal;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD TABLE.to_site_button TD A:hover{color:#ffffff; text-decoration:underline; font-weight:normal;}#style_13209008630000000884 .style_13209008630000000884row{clear:left; padding-bottom:4px;}#style_13209008630000000884 .style_13209008630000000884row2{margin-bottom:8px;}#style_13209008630000000884 .style_13209008630000000884col1{float:left; width:140px; color:#555555; margin-right:-145px;}#style_13209008630000000884 .style_13209008630000000884c ol2{margin-left:145px;}#style_13209008630000000884 DIV.style_13209008630000000884resume_rightcol{float:right; width:280px; margin:0px 0px 10px 30px;}#style_13209008630000000884 DIV.style_13209008630000000884blankheader1{font-size:190%;} +</style> + <div id="style_13209008630000000884" class="mr_read__body"> + <base target="_self" href="http://e.mail.ru/cgi-bin/" /> + + <div id="style_13209008630000000884_BODY"> + + + +<style 
type="text/css" ></style> + + +<table width="100%" cellspacing="0" cellpadding="0" height="100%" border="0" > +<tr ><td > + +</td></tr> +<tr ><td style="padding:5px" height="100%" > +ÐдÑавÑÑвÑйÑе, !<br > +<br > +ÐÑедлагаем Ðам ознакомиÑÑÑÑ Ñо ÑпиÑком заÑегиÑÑÑиÑованнÑÑ ÐºÐ¾Ð¼Ð¿Ð°Ð½Ð¸Ð¹, пÑедÑÑавиÑели коÑоÑÑÑ Ð¿ÑоÑмоÑÑели ÐаÑе ÑезÑме за поÑледние ÑÑÑки.<br > +<br > +<li ><a target="_blank" href="/cgi-bin/link?check=1&cnf=710139&url=http%3A%2F%2;0,0" >Ðомпании, пÑоÑмоÑÑевÑие ÑезÑме â .</a> ÐовÑе: <b >1.</b></li><br > +<br > +ÐÑи ÑÐ²ÐµÐ´ÐµÐ½Ð¸Ñ Ð¿ÑедоÑÑавлÑÑÑÑÑ Ðам иÑклÑÑиÑелÑно Ð´Ð»Ñ Ð¸Ð½ÑоÑмаÑии. ÐÑ Ð¼Ð¾Ð¶ÐµÑе опеÑаÑивно оÑÑлеживаÑÑ, какие именно компании наÑли в базе даннÑÑ Superjob ÐаÑе ÑезÑме и заинÑеÑеÑовалиÑÑ Ð¸Ð¼.<br > +<br > +ÐÑли ÐаÑе ÑезÑме ÑазмеÑено в закÑÑÑом доÑÑÑпе, Ñо его могÑÑ Ð¿ÑоÑмаÑÑиваÑÑ ÑолÑко Ñе ÑабоÑодаÑели, коÑоÑÑм ÐÑ Ð¾ÑпÑавили его ÑамоÑÑоÑÑелÑно.<br > +ÐÑÑоÑÐ¸Ñ Ð¾ÑпÑавки Ñвоего ÑезÑме ÐÑ Ð¼Ð¾Ð¶ÐµÑе поÑмоÑÑеÑÑ Ð¿Ð¾ ÑÑÑлке «ÐÑÑоÑÐ¸Ñ ÑаÑÑÑлки ÑезÑме».<br > +<br > +<br > +<b >Ðнимание!</b><br > +РпÑоÑеÑÑе поиÑка ÑабоÑÑ ÐÑ Ð¼Ð¾Ð¶ÐµÑе ÑÑолкнÑÑÑÑÑ Ñ Ñакими пÑедложениÑми ÑабоÑодаÑелей или кадÑовÑÑ Ð°Ð³ÐµÐ½ÑÑÑв, в коÑоÑÑÑ ÐÐ°Ñ Ð±ÑдÑÑ Ð¿ÑоÑиÑÑ Ð²Ð½ÐµÑÑи оплаÑÑ (за пÑедваÑиÑелÑное обÑÑение, за оÑоÑмление докÑменÑов, за оÑоÑмление обÑзаÑелÑной ÑÑÑÐ°Ñ Ð¾Ð²ÐºÐ¸, на закÑÐ¿ÐºÑ Ð¿ÐµÑвой паÑÑии пÑодÑкÑии ко мпании, пÑедназнаÑенной Ð´Ð»Ñ Ð¿Ñодажи и Ñ.п.) или пÑедоÑÑавиÑÑ Ð¾ÑÑканиÑованнÑе копии докÑменÑов (паÑпоÑÑа, военного билеÑа, ÑÑÑдовой книжки, водиÑелÑÑÐºÐ¸Ñ Ð¿Ñав, пенÑионного ÑдоÑÑовеÑÐ½Ð¸Ñ Ð¸ Ñ.п.) Ð´Ð»Ñ ÑÐºÐ¾Ð±Ñ Ð¿ÑедваÑиÑелÑного оÑоÑÐ¼Ð»ÐµÐ½Ð¸Ñ Ð¸Ð»Ð¸ подÑвеÑÐ¶Ð´ÐµÐ½Ð¸Ñ Ð´Ð°Ð½Ð½ÑÑ , ÑказаннÑÑ Ð² ÐаÑем ÑезÑме.<br > +ÐÑо один из пÑизнаков моÑенниÑеÑÑва! 
ÐÑ ÑекомендÑем Ðам оÑÐµÐ½Ñ Ð¾ÑÑоÑожно оÑноÑиÑÑÑÑ Ðº Ñаким пÑедложениÑм и по возможноÑÑи избегаÑÑ ÑобеÑедований Ñ Ð¿Ð¾Ð´Ð¾Ð±Ð½Ñми ÑабоÑодаÑелÑми.<br > +<br > +Также Ð¼Ñ Ð½Ð°ÑÑоÑÑелÑно не ÑекомендÑем оÑпÑавлÑÑÑ Ð¿Ð»Ð°ÑнÑе SMS-ÑообÑÐµÐ½Ð¸Ñ Ð½Ð° коÑоÑкие номеÑа Ð´Ð»Ñ Ð¿Ð¾Ð»ÑÑÐµÐ½Ð¸Ñ ÐºÐ¾Ð½ÑакÑов или дÑÑгой инÑоÑмаÑии о ваканÑии или же Ð´Ð»Ñ Ð¿Ð¾Ð»ÑÑÐµÐ½Ð¸Ñ ÑезÑлÑÑаÑов ÑеÑÑиÑованиÑ. С оÑганизаÑиÑми, коÑоÑÑе оказÑваÑÑ Ð¿Ð¾Ð´Ð¾Ð±Ð½Ñе ÑÑлÑги, Ð¼Ñ Ð½Ðµ ÑоÑÑÑдниÑаем и пÑедÑпÑеждаем, Ñ� �о ÑÑо Ñоже один из пÑиемов моÑенниÑеÑÑва.<br > +<br > +<br > +<em >x</em> <a target="_blank" href="/cgi-bin/link?check=1&cnf=8d972a&url=http%3A%2F%2Fwww.sup;0,0" >ÐÑклÑÑиÑÑ ÑÐ²ÐµÐ´Ð¾Ð¼Ð»ÐµÐ½Ð¸Ñ Ð¾ новÑÑ Ð¿ÑоÑмоÑÑÐ°Ñ Ð¼Ð¾Ð¸Ñ ÑезÑме</a><br > +<br > +Ðо ÑÑÑлкам в ÑÑом пиÑÑме можно войÑи в ÑиÑÑÐµÐ¼Ñ Ð±ÐµÐ· ввода паÑолÑ. +<br ><br > +</td> +</tr> +<tr > +<td > +<span class="style_13209008630000000884noprint" ><br ><br >ÐÑли Ñ ÐÐ°Ñ ÐµÑÑÑ Ð¿Ð¾Ð¶ÐµÐ»Ð°Ð½Ð¸Ñ Ð¸ идеи по ÑлÑÑÑÐµÐ½Ð¸Ñ ÑеÑвиÑа Superjob, пожалÑйÑÑа, <a target="_blank" href="/cgi-bin/link?check=1;0,0" >напиÑиÑе нам</a>.<br ><br ></span> +<table width="100%" cellspacing="0" cellpadding="10" border="0" class="style_13209008630000000884noprint" > +<tr ><td align="center" style="border-top:1px solid #BACBD7;" > +<a target="_blank" href="/cgi-bin/link?check=1&cnf=8fa2f9&url=http%3A%2F%2Fwww.;0,0" ><big >Superjob â РабоÑа должна доÑÑавлÑÑÑ ÑдоволÑÑÑвие!</big></a> +</td></tr> +</table> +<table width="100%" cellspacing="1" cellpadding="0" border="0" class="style_13209008630000000884noprint" > +<tr ><td align="center" style="padding:5px" > +<span style="color:#999999;font-size:8pt;" >ÐиÑÑмо оÑпÑавлено: xx.xx.xxxx xx:xx:xx</span> +</td></tr> +</table> + +</td></tr> +</table> + + + +</div> + + + <base target="_self" href="http://e.mail.ru/cgi-bin/" /> + </div> +</div> + + + Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/rsstest.rss URL: 
http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/rsstest.rss?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/rsstest.rss (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/rsstest.rss Mon Dec 28 23:10:16 2015 @@ -0,0 +1,36 @@ +<?xml version="1.0" encoding="ISO-8859-1" ?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--> +<rss version="0.91"> + <channel> + <title>TestChannel</title> + <link>http://test.channel.com/</link> + <description>Sample RSS File for Junit test</description> + <language>en-us</language> + + <item> + <title>Home Page of Chris Mattmann</title> + <link>http://www-scf.usc.edu/~mattmann/</link> + <description>Chris Mattmann's home page</description> + </item> + <item> + <title>Awesome Open Source Search Engine</title> + <link>http://www.nutch.org/</link> + <description>Yup, that's what it is</description> + </item> + </channel> +</rss> Added: tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/russian.cp866.txt URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/russian.cp866.txt?rev=1722027&view=auto ============================================================================== --- tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/russian.cp866.txt (added) +++ tika/branches/2.x/tika-parser-test/src/main/resources/test-documents/russian.cp866.txt Mon Dec 28 23:10:16 2015 @@ -0,0 +1,6 @@ +¤ ¦¤ë, ¢ áâ㤥ãî §¨¬îî ¯®àã, + ¨§ «¥áã ¢ë襫; ¡ë« ᨫìë© ¬®à®§. +«ï¦ã, ¯®¤¨¬ ¥âáï ¬¥¤«¥® ¢ £®àã +®è ¤ª , ¢¥§ãé ï 墮à®áâã ¢®§. +, è¥áâ¢ãï ¢ ¦®, ¢ ᯮª®©á⢨¨ 種¬, +®è ¤ªã ¢¥¤¥â ¯®¤ 㧤æë ¬ã¦¨ç®ª
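
Note on the <hang> options: the comment in the documented mock example earlier in this commit describes four attributes (millis, heavy, pulse_millis, interruptible). The sketch below is one possible reading of that comment, written as a standalone Java class. It is not the mock parser's actual implementation. The class name HangSketch and its method names are hypothetical, and the treatment of interruptible="false" for the sleep case (swallow the interrupt, finish the pause, then restore the flag) is an assumption rather than something the comment states.

```java
public class HangSketch {

    /**
     * Hypothetical helper: pauses for roughly {@code millis} and returns the
     * elapsed time in milliseconds. The parameters mirror the attributes of
     * the <hang> element described in the fixture comment.
     */
    public static long hang(long millis, boolean heavy, long pulseMillis, boolean interruptible)
            throws InterruptedException {
        long start = System.nanoTime();
        if (heavy) {
            // The comment says pulse_millis is required when heavy is true.
            if (pulseMillis <= 0) {
                throw new IllegalArgumentException("pulse_millis is required when heavy is true");
            }
            heavyHang(millis, pulseMillis, interruptible, start);
        } else {
            sleepHang(millis, interruptible, start);
        }
        return elapsedMillis(start);
    }

    private static long elapsedMillis(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }

    /** Burns CPU and, once per pulse, checks the interrupt flag and the total time. */
    private static void heavyHang(long millis, long pulseMillis, boolean interruptible, long start)
            throws InterruptedException {
        long lastPulse = start;
        double sink = 0;
        while (true) {
            for (int i = 1; i <= 1000; i++) {
                sink += Math.sqrt(i); // computationally expensive filler work
            }
            long now = System.nanoTime();
            if ((now - lastPulse) / 1_000_000L >= pulseMillis) {
                lastPulse = now;
                if (interruptible && Thread.interrupted()) {
                    throw new InterruptedException("heavy hang interrupted");
                }
                if ((now - start) / 1_000_000L >= millis) {
                    break; // may overshoot millis by up to one pulse
                }
            }
        }
        if (sink < 0) {
            throw new IllegalStateException("unreachable; keeps the filler work observable");
        }
    }

    /** heavy=false: a plain Thread.sleep, optionally ignoring interrupts (assumption). */
    private static void sleepHang(long millis, boolean interruptible, long start)
            throws InterruptedException {
        if (interruptible) {
            Thread.sleep(millis);
            return;
        }
        boolean wasInterrupted = false;
        long remaining = millis;
        while (remaining > 0) {
            try {
                Thread.sleep(remaining);
            } catch (InterruptedException e) {
                wasInterrupted = true; // remember it, but keep sleeping
            }
            remaining = millis - elapsedMillis(start);
        }
        if (wasInterrupted) {
            Thread.currentThread().interrupt(); // restore the flag for the caller
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Values taken from the documented example: millis=100, heavy=true, pulse_millis=10.
        long heavy = hang(100, true, 10, true);
        System.out.println("heavy hang took about " + heavy + " ms");
        long light = hang(100, false, 0, true);
        System.out.println("sleep hang took about " + light + " ms");
    }
}
```

In this reading the heavy variant only checks the clock once per pulse, which is one way to account for the comment's remark that the actual hang time will probably be a bit longer than the value specified.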
