Re: [DISCUSS] Prepare Release 1.5?

2014-01-09 Thread David Meikle
Hi, On 29 Dec 2013, at 11:41, David Meikle loo...@gmail.com wrote: Hi Guys, Some questions have popped up about when a new 1.5 release will be available. I have some free cycles over the next couple of weeks to prepare one and I believe Chris has some too, so in preparation

[jira] [Reopened] (TIKA-1216) parse method of Mp3Parser doesn't work for few mp3 files

2014-01-09 Thread Sumeet Gorab (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumeet Gorab reopened TIKA-1216: Hi Tim Allison, the reported bug is not a duplicate of TIKA-1215, because in TIKA-1215 parse method

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Hong-Thai Nguyen
Hi Ray and all, By searching on issues, I found the issue already created: https://issues.apache.org/jira/browse/TIKA-90 Maybe now is the time to implement it. Thanks, Hong-Thai -Original Message- From: Ray Gauss II [mailto:ray.ga...@alfresco.com] Sent: Wednesday, 8 January 2014 11:49

[jira] [Commented] (TIKA-90) Allow thumbnails as document metadata

2014-01-09 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866498#comment-13866498 ] Hong-Thai Nguyen commented on TIKA-90: Useful for Open XML Office OpenOffice files and

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: By searching on issues, I found the issue already created: https://issues.apache.org/jira/browse/TIKA-90 I'm not sure if the metadata is the right place to return this. Some formats offer a small thumbnail, others can offer a small thumbnail for

Re: [DISCUSS] Prepare Release 1.5?

2014-01-09 Thread Chris Mattmann
Hey Dave, I kind of got bogged down and haven't had time to release. If someone else does have time and wants to pick this up, +1 for it! Cheers, Chris -Original Message- From: David Meikle loo...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Thursday, January 9,

Re: Extract thumbnail from openxml office files

2014-01-09 Thread Mattmann, Chris A (398J)
Hi Hong-Thai, +1 to using cardinality to help denote more complex metadata relationships at least until we get past prior discussions on Metadata and name spacing. See the wiki here for some prior thoughts: http://wiki.apache.org/tika/MetadataDiscussion I know our met structure is simple

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: I agree with you that metadata is not the best place to store thumbnail result. Until now, our metadata is a simple map with key:values. This structure is not really flexible in some cases. Currently, we have four kinds of things that we return for

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Hong-Thai Nguyen
I'm convinced that using embedded resources is a better solution. Thanks, Nick. @Matt, I was unaware that we had already reflected on the metadata structure. Interesting. We should adapt the TIKA-90 title and description. I hope to take the initiative on this work. Hong-Thai -Original Message- From: Nick Burch

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: I'm convinced that using embedded resources is a better solution. OK, sounds like we have a consensus and can go ahead with it, great! One outstanding query is what name we should give to these when we return them as embedded resources, and if we

[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866722#comment-13866722 ] Jukka Zitting commented on TIKA-1217: Nice idea! I think putting such a feature to a

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Hong-Thai Nguyen
Thanks a lot, Nick. That's a great reference. BTW, I may be wrong, but thumbnail handling in Alfresco seems quite complex because Alfresco can call external thumbnail generation with PDFBox or PDFRender. I'm defining the DoD by retaining some of its main features in TIKA-90. Could you guide

RE: Extract thumbnail from openxml office files

2014-01-09 Thread Nick Burch
On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote: BTW, I may be wrong, but thumbnail handling in Alfresco seems quite complex because Alfresco can call external thumbnail generation with PDFBox or PDFRender. It can do, yes, but there are also dedicated classes to pull out most of the common

[jira] [Commented] (TIKA-1216) parse method of Mp3Parser doesn't work for few mp3 files

2014-01-09 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866916#comment-13866916 ] Tim Allison commented on TIKA-1216: Agreed. I didn't think this was a duplicate. It is

[jira] [Updated] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Ansell updated TIKA-1217: Attachment: TIKA-1217.patch Patch to add FileTypeDetector implementation Integrate with Java-7

[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867207#comment-13867207 ] Peter Ansell commented on TIKA-1217: Patch can also be reviewed at GitHub:

[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867228#comment-13867228 ] Nick Burch commented on TIKA-1217: Minor thing, but the section // Then open an

[jira] [Updated] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Ansell updated TIKA-1217: Attachment: TIKA-1217-v2.patch New version of patch checking File instead of InputStream Integrate

[jira] [Commented] (TIKA-1217) Integrate with Java-7 FileTypeDetector API

2014-01-09 Thread Peter Ansell (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867244#comment-13867244 ] Peter Ansell commented on TIKA-1217: Nick: New version of the patch uses Path.toFile()
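
For readers unfamiliar with the Java 7 API that TIKA-1217 targets, the following is a minimal sketch of a java.nio.file.spi.FileTypeDetector subclass. It is not the attached patch: the class name MagicFileTypeDetector and the hard-coded PDF signature check are illustrative assumptions, standing in for whatever call into Tika's detection the real patch makes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.spi.FileTypeDetector;
import java.util.Arrays;

// Hypothetical detector: a real integration would delegate to Tika here.
// The PDF signature check is only a placeholder for that delegation.
class MagicFileTypeDetector extends FileTypeDetector {

    // Leading bytes of a PDF file ("%PDF-").
    private static final byte[] PDF_MAGIC = {'%', 'P', 'D', 'F', '-'};

    @Override
    public String probeContentType(Path path) throws IOException {
        if (!Files.isRegularFile(path)) {
            return null; // null means "type unknown, ask the next detector"
        }
        byte[] head = new byte[PDF_MAGIC.length];
        int read = 0;
        try (InputStream in = Files.newInputStream(path)) {
            // Loop because a single read() may return fewer bytes than requested.
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // end of file before the signature length
                }
                read += n;
            }
        }
        if (read == head.length && Arrays.equals(head, PDF_MAGIC)) {
            return "application/pdf";
        }
        return null;
    }
}
```

A detector like this is picked up by Files.probeContentType(Path) once its class name is listed in a META-INF/services/java.nio.file.spi.FileTypeDetector file on the class path; it can also be instantiated and called directly, as a test would.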