I think I'm starting to understand what you are trying to get.
You don't want the original content, only the extracted content, right?

I think it should work if you store the extracted content as a field.

Something like this (in mapping):

{
    "person" : {
        "properties" : {
            "file" : {
                "type" : "attachment",
                "fields" : {
                    "file" : {"index" : "no", "store" : "yes"}
                }
            }
        }
    }
}
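
For what it's worth, here is a minimal Python sketch of how that mapping could be sent to the cluster. It only builds the request and does not send it. The endpoint layout (/index/person/_mapping), the host, and the helper name build_put_mapping_request are my assumptions, so adjust them to your setup. As far as I know the mapping needs to be in place before the documents are indexed, so you may have to reindex.

```python
import json
import urllib.request

# Mapping from the message above: keep the extracted text as a stored field.
MAPPING = {
    "person": {
        "properties": {
            "file": {
                "type": "attachment",
                "fields": {
                    "file": {"index": "no", "store": "yes"}
                },
            }
        }
    }
}


def build_put_mapping_request(base_url="http://localhost:9200",
                              index="index", doc_type="person"):
    """Build (but do not send) a PUT request carrying the mapping."""
    # Assumed endpoint layout: /{index}/{type}/_mapping
    url = "{}/{}/{}/_mapping".format(base_url, index, doc_type)
    body = json.dumps(MAPPING).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


if __name__ == "__main__":
    request = build_put_mapping_request()
    print(request.get_method(), request.full_url)
    # To actually apply it: urllib.request.urlopen(request)
```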

Then, when you search, ask for the field "file.file" instead of _source (the default):
curl -XGET 'http://localhost:9200/index/person/_search?q=whatever&fields=file.file'
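
If you are calling this from code rather than curl, a rough Python sketch could look like the following. It builds the same search URL and pulls the "file.file" values out of a parsed response. The helper names and the sample response are hypothetical, and I am assuming the stored value shows up under "fields" in each hit, either as a string or as a list depending on the version.

```python
import urllib.parse


def build_search_url(query, base_url="http://localhost:9200",
                     index="index", doc_type="person", field="file.file"):
    """Build the same URL as the curl command, with the query escaped."""
    params = urllib.parse.urlencode({"q": query, "fields": field})
    return "{}/{}/{}/_search?{}".format(base_url, index, doc_type, params)


def extracted_texts(response, field="file.file"):
    """Collect the stored field values from a parsed search response."""
    texts = []
    for hit in response.get("hits", {}).get("hits", []):
        value = hit.get("fields", {}).get(field)
        if value is None:
            continue  # the field was not returned for this hit
        # Assumption: the value may be a single string or a list of strings.
        texts.extend(value if isinstance(value, list) else [value])
    return texts


if __name__ == "__main__":
    print(build_search_url("whatever"))
    # Hypothetical response shape, for illustration only.
    sample = {"hits": {"hits": [
        {"_id": "1", "fields": {"file.file": ["some extracted text"]}}
    ]}}
    print(extracted_texts(sample))
```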

That should work, I guess.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 20 March 2014 at 10:12:01, sAs59 ([email protected]) wrote:

It's still unclear. I've decoded my whole text, but I'm getting this
kind of text instead.
Where should I see my actual text?
I also tried using a different charset, but it's still unclear.

<</Filter/FlateDecode/Length 1549>>
stream
[... compressed binary stream data omitted ...]
endstream
endobj
5 0 obj
<</Type/Font/Subtype/TrueType/Name/F1/BaseFont/Times#20New#20Roman/Encoding/WinA

View this message in context: Re: searching pdf files by content with 
Mongodb-river
Sent from the ElasticSearch Users mailing list archive at Nabble.com.
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2B5_B1CzWZCxFbYL_akVm%2B%2Bjh%2BwQj-NXsAgedTsp3sLbUtNpKw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

