[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867244#comment-13867244 ] Peter Ansell commented on TIKA-1217: Nick: New version of the patch uses Path.toFile()

[jira] [Updated] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Ansell updated TIKA-1217: --- Attachment: TIKA-1217-v2.patch New version of patch checking File instead of InputStream > Integrate

[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867228#comment-13867228 ] Nick Burch commented on TIKA-1217: -- Minor thing, but the section "// Then open an InputStr

[jira] [Updated] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Ansell updated TIKA-1217: --- Attachment: TIKA-1217.patch Patch to add FileTypeDetector implementation > Integrate with Java-7 File
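For readers unfamiliar with the Java 7 API named in this issue, the following is a minimal sketch of a java.nio.file.spi.FileTypeDetector subclass. It is not the attached patch: the class name SketchFileTypeDetector and the helper guessFromMagic are hypothetical, and the two hard-coded magic numbers are only a stand-in for the point where a Tika-backed detector would presumably delegate to Tika's own detection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.spi.FileTypeDetector;

// Hypothetical illustration only; not the class from the TIKA-1217 patch.
final class SketchFileTypeDetector extends FileTypeDetector {

    private static final byte[] PDF_MAGIC = {'%', 'P', 'D', 'F'};
    private static final byte[] ZIP_MAGIC = {'P', 'K', 3, 4};

    @Override
    public String probeContentType(Path path) throws IOException {
        // Returning null tells the caller that this detector cannot decide.
        if (!Files.isRegularFile(path)) {
            return null;
        }
        byte[] head = new byte[4];
        int total = 0;
        try (InputStream in = Files.newInputStream(path)) {
            int read;
            // Read up to head.length bytes; a single read() may return fewer.
            while (total < head.length
                    && (read = in.read(head, total, head.length - total)) != -1) {
                total += read;
            }
        }
        // Assumption: a Tika-backed detector would call Tika here instead.
        return guessFromMagic(head, total);
    }

    // Maps the leading bytes of a file to a media type, or null if unknown.
    static String guessFromMagic(byte[] head, int length) {
        if (startsWith(head, length, PDF_MAGIC)) {
            return "application/pdf";
        }
        if (startsWith(head, length, ZIP_MAGIC)) {
            return "application/zip";
        }
        return null;
    }

    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Files.probeContentType(Path) discovers installed detectors through ServiceLoader, so to register a detector like this it would need to be a public class listed in META-INF/services/java.nio.file.spi.FileTypeDetector. The sketch is kept package-private only to stay self-contained.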

[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867207#comment-13867207 ] Peter Ansell commented on TIKA-1217: Patch can also be reviewed at GitHub: https://git

[jira] [Commented] (TIKA-1216) parse method of Mp3Parser doesn't work for few mp3 files

2014-01-09 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866916#comment-13866916 ] Tim Allison commented on TIKA-1216: --- Agreed. I didn't think this was a duplicate. It is

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: BTW, maybe I'm wrong to say that thumbnail handling in Alfresco is quite complex, because Alfresco can call external thumbnail generation with PDFBox or PDFRender. It can do, yes, but there are also dedicated classes to pull out most of the common

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Hong-Thai Nguyen
Thanks a lot Nick, that's a great reference. BTW, maybe I'm wrong to say that thumbnail handling in Alfresco is quite complex, because Alfresco can call external thumbnail generation with PDFBox or PDFRender. I'm defining the DoD by retaining some main features from this in TIKA-90. Could you guide

[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866722#comment-13866722 ] Jukka Zitting commented on TIKA-1217: - Nice idea! I think putting such a feature to a

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: I'm convinced that using embedded resources is a better solution. OK, sounds like we have a consensus and can go ahead with it, great! One outstanding query is what name we should give to these when we return them as embedded resources, and if we sh

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Hong-Thai Nguyen
I'm convinced that using embedded resources is a better solution. Thanks Nick. @Matt, I wasn't aware that we had already reflected on the metadata structure. Interesting. We should adapt the TIKA-90 title & description. I hope to take the initiative on this work. Hong-Thai -Original Message- From: Nick Burch [

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: I agree with you that metadata is not the best place to store the thumbnail result. Until now, our metadata has been a simple map of key:values. This structure is not really flexible in some cases. Currently, we have four kinds of "things" that we return for

Re: Extract thumbnail from openxml office files

2014-01-09 Thread Mattmann, Chris A (398J)
Hi Hong-Thai, +1 to using cardinality to help denote more complex metadata relationships, at least until we get past prior discussions on Metadata and name spacing. See the wiki here for some prior thoughts: http://wiki.apache.org/tika/MetadataDiscussion I know our met structure is simple -

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Hong-Thai Nguyen
Hi Nick, You're beginning a very interesting topic about the foundation of our metadata concept :) I agree with you that metadata is not the best place to store the thumbnail result. Until now, our metadata has been a simple map of key:values. This structure is not really flexible in some cases. For example,

Re: [DISCUSS] Prepare Release 1.5?

2014-01-09 Thread Chris Mattmann
Hey Dave, I kind of got bogged down and haven't had time to release. If someone else does have time and wants to pick this up, +1 for it! Cheers, Chris -Original Message- From: David Meikle Reply-To: "dev@tika.apache.org" Date: Thursday, January 9, 2014 3:46 AM To: "dev@tika.apache.

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: Searching the issues, I found that this one has already been created: https://issues.apache.org/jira/browse/TIKA-90 I'm not sure if the metadata is the right place to return this. Some formats offer a small thumbnail, others can offer a small thumbnail for ev

[jira] [Commented] (TIKA-90) Allow thumbnails as document metadata

2014-01-09 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866498#comment-13866498 ] Hong-Thai Nguyen commented on TIKA-90: -- Useful for Open XML Office & OpenOffice files an

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Hong-Thai Nguyen
Hi Ray & all, Searching the issues, I found that this one has already been created: https://issues.apache.org/jira/browse/TIKA-90 Maybe now is the time to implement it. Thanks, Hong-Thai -Original Message- From: Ray Gauss II [mailto:ray.ga...@alfresco.com] Sent: Wednesday, 8 January 2014 11:49

[jira] [Reopened] (TIKA-1216) parse method of Mp3Parser doesn't work for few mp3 files

2014-01-09 Thread Sumeet Gorab (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumeet Gorab reopened TIKA-1216: Hi Tim Allison, the reported bug is not a duplicate of TIKA-1215, because in TIKA-1215 the parse method gives

Re: [DISCUSS] Prepare Release 1.5?

2014-01-09 Thread David Meikle
Hi, On 29 Dec 2013, at 11:41, David Meikle wrote: > Hi Guys, > > There have been some questions pop up around when a new 1.5 release will be > available. > > I have some free cycles over the next couple of weeks to prepare one and I > believe Chris has some too, so in preparation for that w