Hi Karl,

I have an error as follows:
FATAL 2017-02-07 23:56:09,483 (Worker thread '29') - Error tossed: For input string: "myFolder/test:<CADNgPDgSXHeWo0GDnUL6S2sogUsXUa9mx2WxOT23Wi3 [email protected]>"
java.lang.NumberFormatException: For input string: "myFolder/test:<cadngpdgsxhewo0gdnul6s2sogusxua9mx2wxot23wi37hog...@mail.gmail.com>"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.parseInt(Integer.java:615)
        at org.apache.manifoldcf.crawler.connectors.email.EmailConnector.processDocuments(EmailConnector.java:705)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)

2017-02-07 22:50 GMT+03:00 Cihad Guzel <[email protected]>:

> Thanks Karl,
>
> I will try it.
>
> Regards
> Cihad Guzel
>
> 2017-02-07 22:36 GMT+03:00 Karl Wright <[email protected]>:
>
>> I've created a ticket and attached a patch to it. CONNECTORS-1375.
>> Please let me know if it works for you; if not, I'll fix what doesn't work.
>>
>> Karl
>>
>>
>> On Tue, Feb 7, 2017 at 1:19 PM, Karl Wright <[email protected]> wrote:
>>
>>> Correction: the only metadata attribute we set is the attachment(s)
>>> mimetype (as a multivalued field) -- this doesn't currently include the
>>> attachment data.
>>>
>>> Karl
>>>
>>>
>>> On Tue, Feb 7, 2017 at 1:14 PM, Karl Wright <[email protected]> wrote:
>>>
>>>> Hi Cihad,
>>>>
>>>> The email connector is providing the attachment data unextracted to the
>>>> output connector as metadata attribute data. There are no transformation
>>>> connectors that look at this metadata. Solr cell also probably does not
>>>> handle binary in random metadata attributes the proper way.
>>>>
>>>> The connector's attachment code therefore seems to be designed only to
>>>> deal with textual attachments. The right solution is to have individual
>>>> IDs for each attachment. But that would also require there to be a URL we
>>>> could construct for each attachment. We could provide an additional URI
>>>> template for attachments, but I'd wonder if your system has the ability to
>>>> serve attachments by their own URLs. Please let me know if this would work
>>>> and if so I can create a ticket and work on making these changes.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>> On Tue, Feb 7, 2017 at 12:56 PM, Cihad Guzel <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I tried the email connector with Gmail. I attached the file [1] to a
>>>>> new email and sent it to my test email address.
>>>>>
>>>>> My mail content body is like: "this is test mail for mfc"
>>>>>
>>>>> Then I ran my email job, and the email was indexed into Solr
>>>>> successfully. However, Solr's content field does not contain my
>>>>> attachment's content body. The Solr content field looks like:
>>>>>
>>>>> "content":" \n \n \n \n \n \n \n \n \n \n
>>>>> --94eb2c1910841bc55f0547f43443\r\nContent-Type:
>>>>> multipart/alternative; boundary=94eb2c1910841bc553054
>>>>> 7f43441\r\n\r\n--94eb2c1910841bc5530547f43441\r\nContent-Type:
>>>>> text/plain; charset=UTF-8\r\n\r\nthis is test mail for
>>>>> mfc.\r\n\r\n--94eb2c1910841bc5530547f43441\r\nContent-Type:
>>>>> text/html; charset=UTF-8\r\n\r\n<div dir=\"ltr\">this is test mail for
>>>>> mfc.\r\n</div>\r\n\r\n--94eb2c1910841bc5530547f43441--\r\n--
>>>>> 94eb2c1910841bc55f0547f43443\r\nContent-Type: application/pdf;
>>>>> name=\"pdf-test.pdf\"\r\nContent-Disposition: attachment;
>>>>> filename=\"pdf-test.pdf\"\r\nContent-Transfer-Encoding:
>>>>> base64\r\nX-Attachment-Id: f_iyvt78qa0\r\n\r\nJVBERi0xLjY
>>>>> NJeLjz9MNCjM3IDAgb2JqIDw8L0xpbmVhcml6ZWQgMS9MIDIwNTk3L08gNDA
>>>>> vRSAx\r\nNDExNS9OIDEvVCAxOTc5NS9IIFsgMTAwNSAyMTVdPj4NZW5kb2J
>>>>> qDSAgICAgICAgICAgICAgICAg\r\nDQp4cmVmDQozNyAzNA0KMDAwMDAwMDA
>>>>> xNiAwMDAwMCBuDQowMDAwMDAxMzg2IDAwMDAwIG4NCjAw\r\nMDAwMDE1MjIgMDAwM
>>>>> ..."
>>>>>
>>>>> Does the MFC email connector know that the attachment's file type is
>>>>> PDF? Does it not extract the contents?
>>>>>
>>>>> [1] http://www.orimi.com/pdf-test.pdf
>>>>> --
>>>>> Regards
>>>>> Cihad Güzel
>
> --
> Thanks
> Cihad Güzel

--
Thanks
Cihad Güzel
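
For reference, the stack trace in this message shows Integer.parseInt being handed a string of the form "folder:<message-id>", which is not numeric. The failure mode can be sketched in isolation roughly as below. This is only an illustration under assumptions: the class and method names are hypothetical, the "folder:message-id" layout is inferred from the log line alone, the identifier is a placeholder, and none of this is the actual EmailConnector code or the CONNECTORS-1375 patch.

```java
import java.util.OptionalInt;

public class DocumentIdParseSketch {

    // Hypothetical helper: parse a decimal int, returning empty instead of
    // throwing when the text is not numeric.
    public static OptionalInt tryParseInt(String text) {
        try {
            return OptionalInt.of(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Hypothetical helper: split "folder:message-id" at the first colon.
    // Assumption: this identifier layout is inferred from the log line only.
    public static String[] splitFolderAndMessageId(String documentIdentifier) {
        int colon = documentIdentifier.indexOf(':');
        if (colon < 0) {
            // No colon: treat the whole string as the message id.
            return new String[] { "", documentIdentifier };
        }
        return new String[] {
            documentIdentifier.substring(0, colon),
            documentIdentifier.substring(colon + 1)
        };
    }

    public static void main(String[] args) {
        // Placeholder identifier; not the real message ID from the log.
        String documentIdentifier = "myFolder/test:<some-id@example.invalid>";

        // Passing the whole identifier to Integer.parseInt throws, as in the trace.
        try {
            Integer.parseInt(documentIdentifier);
        } catch (NumberFormatException e) {
            System.out.println("parseInt failed: " + e.getMessage());
        }

        // The guarded variant reports "not numeric" without throwing.
        System.out.println("numeric? " + tryParseInt(documentIdentifier).isPresent());

        String[] parts = splitFolderAndMessageId(documentIdentifier);
        System.out.println("folder = " + parts[0]);
        System.out.println("message id = " + parts[1]);
    }
}
```

Whether the connector should split the identifier, or should never receive this form at line 705 in the first place, is something only the connector source can answer; the sketch only shows why the exception is raised.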
