[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614728#comment-14614728 ]

Satya Deep Maheshwari commented on SLING-4694:

To be able to use this improvement, I think it should be released so that it can be used outside Sling without snapshot dependencies. Also, can a link to download it please be added to [1]?

[1] - https://sling.apache.org/downloads.cgi

Add ability to identify mime type based on file content
-------------------------------------------------------

Key: SLING-4694
URL: https://issues.apache.org/jira/browse/SLING-4694
Project: Sling
Issue Type: Improvement
Components: Servlets
Affects Versions: JCR Webdav 2.2.2
Reporter: Satya Deep Maheshwari
Assignee: Bertrand Delacretaz
Attachments: SLING-4694.patch

*Problem description:*
I am facing a problem with the mime type detection of a file. While debugging, I see that the SlingTikaDetector.detect method is used for detecting the mime type of my file. See [1]. This method seems to rely only on the name of the file for detecting its mime type. Even though it is passed an input stream of the file, it does not seem to use it for mime type detection. So if my file name is something like xyz.tmp, it detects its mime type as application/octet-stream (the default) while it may actually be a png file. This is a common scenario with webdav clients, where temporary files get created with such names while being edited.

*Suggested Solution:*
Quoting [~rombert]
{quote}
Following the discussions at SLING-1059 [1] and SLING-255 [2] I can infer that we more or less opted out of the 'heavy-weight' approach of actually parsing the input stream. Not sure if we want to revisit that TBH. At any rate, our MimeTypeService does not have an API for getting the file content based on the input stream.

I think though there's a way around it, but only at the code level. The org.apache.sling.jcr.webdav.impl.helper.SlingResourceConfig class hardcodes the Detector implementation to be a SlingTikaDetector. I think it is worthwhile to raise a Jira issue for this and it would definitely expedite the fix if you're willing to submit a patch / pull request.

I think it can be as simple as adding a @Reference to a Tika Detector to the SlingWebDavServlet and then passing that to the SlingServletConfig.

Cheers,
Robert

[1]: https://issues.apache.org/jira/browse/SLING-1059
[2]: https://issues.apache.org/jira/browse/SLING-255
{quote}

Related mailing-list thread on this: http://apache-sling.73963.n3.nabble.com/mime-type-detection-td4050586.html

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
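The difference between the name-only behaviour reported in this issue and content-based detection can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: MimeSniffer, detectByName and detectByContent are hypothetical names, the extension table is a stand-in for a real registry, and only the PNG signature is checked. It is not the SlingTikaDetector code or the Tika API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Sling or Tika.
final class MimeSniffer {

    static final String DEFAULT_TYPE = "application/octet-stream";

    // The fixed eight-byte signature that starts every PNG file.
    private static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A
    };

    private MimeSniffer() {
    }

    /** Name-only detection: looks at the file extension and nothing else. */
    static String detectByName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "png":
                return "image/png";
            case "txt":
                return "text/plain";
            default:
                return DEFAULT_TYPE;
        }
    }

    /** Content-first detection: checks the leading bytes, then falls back to the name. */
    static String detectByContent(String fileName, InputStream content) {
        if (content != null && content.markSupported()) {
            try {
                content.mark(PNG_SIGNATURE.length);
                byte[] head = new byte[PNG_SIGNATURE.length];
                int filled = 0;
                while (filled < head.length) {
                    int n = content.read(head, filled, head.length - filled);
                    if (n < 0) {
                        break; // stream ended before a full signature was read
                    }
                    filled += n;
                }
                content.reset(); // rewind so the caller can still read the whole stream
                if (filled == head.length && Arrays.equals(head, PNG_SIGNATURE)) {
                    return "image/png";
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return detectByName(fileName);
    }
}

final class MimeSnifferDemo {
    public static void main(String[] args) {
        byte[] png = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A, 0, 0};
        // Name-only detection has nothing to go on for a ".tmp" file.
        System.out.println(MimeSniffer.detectByName("xyz.tmp"));
        // Content-first detection recognises the PNG signature.
        System.out.println(MimeSniffer.detectByContent("xyz.tmp", new ByteArrayInputStream(png)));
    }
}
```

The sketch marks and resets the stream so that the caller can still store the full content afterwards; a stream that does not support mark/reset is simply handled by name here.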
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614945#comment-14614945 ]

Bertrand Delacretaz commented on SLING-4694:

I have started the org.apache.sling.commons.contentdetection/1.0.2 release vote
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615105#comment-14615105 ]

Satya Deep Maheshwari commented on SLING-4694:

Thanks for releasing.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594555#comment-14594555 ]

ASF GitHub Bot commented on SLING-4694:

Github user sdmcraft closed the pull request at:

https://github.com/apache/sling/pull/89
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562449#comment-14562449 ]

Satya Deep Maheshwari commented on SLING-4694:

[~bdelacretaz], thanks for your guidance in taking this to a conclusion. Marking closed.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562602#comment-14562602 ]

Bertrand Delacretaz commented on SLING-4694:

I have added some basic docs about this new bundle to http://sling.apache.org/documentation/bundles/mime-type-support-commons-mime.html
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561045#comment-14561045 ]

Bertrand Delacretaz commented on SLING-4694:

Thanks very much for your contribution. I have committed the new content detection module in http://svn.apache.org/r1682035

There's still some minor cleanup left that I'll hopefully do by tomorrow, but the module can already be used for your webdav servlet enhancements.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554108#comment-14554108 ]

Satya Deep Maheshwari commented on SLING-4694:

Attached a patch containing the contentdetection module. Here's the related commit: https://github.com/sdmcraft/sling/commit/83f396c1c24ed9509dfc43fa67cdcd87eff16c20

Logged SLING-4732 for tracking the webdav servlet enhancement.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549984#comment-14549984 ]

Bertrand Delacretaz commented on SLING-4694:

Ok, splitting this into contentdetection (here) and the webdav servlet in another ticket sounds good!
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549909#comment-14549909 ]

Satya Deep Maheshwari commented on SLING-4694:

{quote}Apart from that, are you planning more changes to your bundles/commons/contentdetection module?{quote}
Nothing apart from a few cleanup tasks around UTs and docs, and some code cleanup as per https://maven.apache.org/developers/conventions/code.html. I'll complete them and create a separate PR with just the contentdetection module changes. I would be delighted if this module can be useful for others and can become part of Sling.

On the webdav servlet, the check is not only in the 'activate' method but also in the 'update' method, and hence gets called whenever 'detectionmode' is changed or the ContentAwareMimeTypeService gets bound/unbound. Do you think this mitigates your concern about the service becoming available/unavailable later?

{quote}I'd rather use a detectMimeType method which decides to use ContentAwareMimeTypeService or MimeTypeService right when the detection happens, to be more dynamic.{quote}
'detectMimeType' is called by the SlingTikaDetector, which does not know about the detectionMode set on its invoker somewhere down the call stack. Could you please provide some guidance as to how 'detectMimeType' can decide dynamically?

Maybe I'll log a separate Jira issue for this webdav related change so that we can take it forward there and limit this issue to the contentdetection module only.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548046#comment-14548046 ]

Bertrand Delacretaz commented on SLING-4694:

Testing the optional ContentAwareMimeTypeService reference in the SimpleWebdavServlet's activate method looks fragile to me: as the reference is optional, it might be set later, AFAIK.

I'd rather use a detectMimeType method which decides to use ContentAwareMimeTypeService or MimeTypeService right when the detection happens, to be more dynamic.

Setting the detectionMode based on configuration in the activate method is fine.

Apart from that, are you planning more changes to your bundles/commons/contentdetection module? If not, and if you want to donate it, I could commit it and we can look at the webdav module changes later.
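The idea of deciding between ContentAwareMimeTypeService and MimeTypeService at detection time, rather than once in the activate method, could look roughly like the sketch below. The two interfaces are simplified, hypothetical stand-ins (the real Sling interfaces may declare different methods), and DynamicMimeTypeDetector with its bind/unbind methods is an assumed name that only imitates how an optional service reference can come and go.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-ins for the two services named in the thread; the real
// Sling interfaces may declare different methods.
interface NameOnlyMimeService {
    String getMimeType(String fileName);
}

interface ContentAwareMimeService {
    String getMimeType(String fileName, InputStream content);
}

final class DynamicMimeTypeDetector {

    private final NameOnlyMimeService nameService;

    // Optional dependency: it can be bound or unbound after activation.
    private final AtomicReference<ContentAwareMimeService> contentService = new AtomicReference<>();

    DynamicMimeTypeDetector(NameOnlyMimeService nameService) {
        this.nameService = nameService;
    }

    void bindContentService(ContentAwareMimeService service) {
        contentService.set(service);
    }

    void unbindContentService(ContentAwareMimeService service) {
        // Only clear the reference if it still points at the service being removed.
        contentService.compareAndSet(service, null);
    }

    /** Chooses the service at call time rather than once at activation. */
    String detectMimeType(String fileName, InputStream content) {
        ContentAwareMimeService current = contentService.get();
        if (current != null && content != null) {
            return current.getMimeType(fileName, content);
        }
        return nameService.getMimeType(fileName);
    }
}

final class DynamicMimeTypeDetectorDemo {
    public static void main(String[] args) {
        DynamicMimeTypeDetector detector =
                new DynamicMimeTypeDetector(name -> "application/octet-stream");
        ContentAwareMimeService sniffing = (name, in) -> "image/png";
        InputStream body = new ByteArrayInputStream(new byte[0]);

        System.out.println(detector.detectMimeType("xyz.tmp", body)); // name-only service
        detector.bindContentService(sniffing);
        System.out.println(detector.detectMimeType("xyz.tmp", body)); // content-aware service
        detector.unbindContentService(sniffing);
        System.out.println(detector.detectMimeType("xyz.tmp", body)); // name-only service again
    }
}
```

Reading the reference once per call means a service that is bound later is picked up on the next detection without re-running any activation logic. A configured detection mode could be added as one more condition in detectMimeType.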
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543453#comment-14543453 ]

Satya Deep Maheshwari commented on SLING-4694:

I have added the proposed config for mime type detection mode in webdav servlet. Please review https://github.com/sdmcraft/sling/commit/bc64ae883116274d09f10aea5cefdaa1686b795b

Meanwhile I am working on adding some test coverage for webdav to thoroughly validate these changes.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541784#comment-14541784 ]

Satya Deep Maheshwari commented on SLING-4694:
----------------------------------------------

I have updated the pull request as per the review comments. Details below:
* Removed the empty activator
* Removed the versions of exported packages from the pom
* Removed MimeDetectionConstants
* Removed DETECTION_MODE from the basic MimeTypeServiceImpl
* Removed all changes from the mime bundle. No changes are needed there whatsoever.
* Added an optional reference to ContentAwareMimeTypeService in the webdav servlet. The webdav servlet gives preference to ContentAwareMimeTypeService if it is available, else it falls back to the basic MimeTypeService.

{quote} metatype info is not in the component itself {quote}
[~cziegeler], can you please explain what needs to be done for your above comment?
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541849#comment-14541849 ]

Satya Deep Maheshwari commented on SLING-4694:
----------------------------------------------

OK, I see what you mean. IIUC, you want the service consumer to be able to declare its requirement on this service as optional, mandatory or not needed. I had assumed that the decision whether to use this service (optionally or compulsorily) or not at all would be made when the code is written, the way the webdav servlet's code currently uses it optionally if it is available.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541901#comment-14541901 ]

Carsten Ziegeler commented on SLING-4694:
-----------------------------------------

Right, that's one way of doing it, but I assume the webdav servlet is pretty generic and there are only some use cases where the new detection is needed. Putting it in there puts a penalty on all users. If that's acceptable to everyone we can of course go this way.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541957#comment-14541957 ]

Bertrand Delacretaz commented on SLING-4694:
--------------------------------------------

I agree with making the mime type detection mode configurable in the webdav servlet, with 3 values as Carsten suggests:
* a) Use MimeTypeService only
* b) Prefer ContentAwareMimeTypeService, but fall back to MimeTypeService if it is not available
* c) Require ContentAwareMimeTypeService

And make b) the default: it is backwards compatible, as ContentAwareMimeTypeService won't be present unless explicitly made available by installing the corresponding bundle.
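The three modes listed in the comment above could be sketched roughly as follows. This is only an illustration: the two interfaces are simplified stand-ins for the Sling services (the real method signatures may differ, for example by declaring checked exceptions), and the names DetectionMode, MimeTypeResolver and resolve are hypothetical, not taken from the patch.

```java
import java.io.InputStream;

// Simplified stand-ins for the two Sling services (assumption: real signatures may differ).
interface MimeTypeService {
    String getMimeType(String name);
}

interface ContentAwareMimeTypeService {
    String getMimeType(String name, InputStream content);
}

// Hypothetical names for modes a), b) and c) from the comment.
enum DetectionMode { NAME_ONLY, PREFER_CONTENT_AWARE, REQUIRE_CONTENT_AWARE }

final class MimeTypeResolver {
    private final DetectionMode mode;
    private final MimeTypeService nameBased;                 // always available
    private final ContentAwareMimeTypeService contentAware;  // null if the bundle is not installed

    MimeTypeResolver(DetectionMode mode, MimeTypeService nameBased,
                     ContentAwareMimeTypeService contentAware) {
        this.mode = mode;
        this.nameBased = nameBased;
        this.contentAware = contentAware;
    }

    String resolve(String name, InputStream content) {
        switch (mode) {
            case NAME_ONLY:
                // a) ignore the content-aware service even if it is present
                return nameBased.getMimeType(name);
            case PREFER_CONTENT_AWARE:
                // b) use content detection when available, otherwise fall back
                return contentAware != null
                        ? contentAware.getMimeType(name, content)
                        : nameBased.getMimeType(name);
            case REQUIRE_CONTENT_AWARE:
                // c) fail loudly rather than silently degrade to name-based detection
                if (contentAware == null) {
                    throw new IllegalStateException("ContentAwareMimeTypeService is required but not available");
                }
                return contentAware.getMimeType(name, content);
            default:
                throw new AssertionError("Unknown mode: " + mode);
        }
    }
}
```

With mode b) as the default, an installation without the content detection bundle behaves exactly as before, which is the backwards-compatibility argument made in the comment.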
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539487#comment-14539487 ]

Bertrand Delacretaz commented on SLING-4694:
--------------------------------------------

Ok, MimeDetectionConstants.DETECTION_MODE makes sense for ContentAwareMimeTypeService then, but it's not needed in MimeTypeService.

Conceptually I think this extension shouldn't require any changes to the commons/mime bundle, so I suggest adding the DETECTION_MODE constant to the ContentAwareMimeTypeService interface and using a lowercase "tika" value for the implementation that you suggest, without defining a constant for that value.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539412#comment-14539412 ]

Bertrand Delacretaz commented on SLING-4694:
--------------------------------------------

Thanks, the patch looks good to me. There are a few minor things to adjust, but it looks useful!

I think we might not need the MimeDetectionConstants.DETECTION_MODE service property, as you need to cast to ContentAwareMimeTypeService anyway. That property would have been useful if we had extended the MimeTypeService API, but with two distinct service interfaces it's not really useful.

How about having two references in the client (SlingWebDavServlet), a required MimeTypeService and an optional ContentAwareMimeTypeService? You can then use the latter if present, and the former if not. And get rid of the DETECTION_MODE, which also means no changes to the existing commons/mime module.
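A minimal sketch of the "required plus optional reference" idea from the comment above, written without any OSGi annotations so that it stays self-contained. The interfaces are simplified stand-ins for the Sling services, and WebDavMimeTypes, bindContentAware, unbindContentAware and detect are hypothetical names; in a real component the bind and unbind methods would be driven by Declarative Services.

```java
import java.io.InputStream;
import java.util.Objects;

// Simplified stand-ins for the two Sling services (assumption: real signatures may differ).
interface MimeTypeService {
    String getMimeType(String name);
}

interface ContentAwareMimeTypeService {
    String getMimeType(String name, InputStream content);
}

final class WebDavMimeTypes {
    // Required reference: the client cannot work without it.
    private final MimeTypeService required;

    // Optional reference: null while no content-aware service is registered.
    private volatile ContentAwareMimeTypeService optional;

    WebDavMimeTypes(MimeTypeService required) {
        this.required = Objects.requireNonNull(required);
    }

    // Called when a content-aware service appears.
    void bindContentAware(ContentAwareMimeTypeService service) {
        optional = service;
    }

    // Called when it goes away; only clear the field if it still points at that service.
    void unbindContentAware(ContentAwareMimeTypeService service) {
        if (optional == service) {
            optional = null;
        }
    }

    String detect(String name, InputStream content) {
        // Read the volatile field once so a concurrent unbind cannot cause a null dereference.
        ContentAwareMimeTypeService service = optional;
        return service != null
                ? service.getMimeType(name, content)
                : required.getMimeType(name);
    }
}
```

The client needs no DETECTION_MODE property in this arrangement: presence or absence of the optional service is the only switch.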
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539433#comment-14539433 ]

Carsten Ziegeler commented on SLING-4694:
-----------------------------------------

I just had a brief look, but I noticed some things:
- there is an empty activator
- versions for exported packages are defined in the pom
- metatype info is not in the component itself

I guess the most interesting question is how to enable this? First of all, when this is used in another bundle the package imports should be optional, but I guess there should also be an opt-out or opt-in mechanism even if that service is available.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539449#comment-14539449 ]

Bertrand Delacretaz commented on SLING-4694:
--------------------------------------------

Thanks Carsten - do you have a suggestion for an opt-in or opt-out mechanism?

My current idea is that both ContentAwareMimeTypeService and MimeTypeService might be available, and the client code decides (maybe based on a configuration parameter of that client code) which one to use.

One might for example want to use ContentAwareMimeTypeService for some parts of their content tree, and the plain MimeTypeService for other parts.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539464#comment-14539464 ]

Satya Deep Maheshwari commented on SLING-4694:
----------------------------------------------

Thanks for reviewing this. I did not add the new API to MimeTypeService because it would add no value there unless MimeTypeService itself were made content aware, which would defeat the purpose of introducing ContentAwareMimeTypeService.

{quote} I think we might not need the MimeDetectionConstants.DETECTION_MODE {quote}
Yes, it is of no use right now as we have just 2 modes (Tika and Filename) and their respective interfaces. But I think this may become useful if we want the flexibility of multiple content detection mechanisms (other than Tika). This setting provides a way to have multiple ContentAwareMimeTypeService implementations and the ability for a client to choose one.

{quote} having two references in the client (SlingWebDavServlet) {quote}
Yes, this makes sense. Having DETECTION_MODE on the basic MimeTypeService is not needed. Will change this.
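The point about several ContentAwareMimeTypeService implementations distinguished by a detection mode could look roughly like the sketch below. It is an assumption-laden illustration: the interface is a simplified stand-in, DetectorRegistry and its methods are hypothetical, and a plain map stands in for what an OSGi service registry would do with a service property and a target filter. The only mode value taken from the thread is the lowercase "tika".

```java
import java.io.InputStream;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for the Sling service (assumption: real signature may differ).
interface ContentAwareMimeTypeService {
    String getMimeType(String name, InputStream content);
}

// Hypothetical registry keyed by a detection-mode value such as "tika".
final class DetectorRegistry {
    private final Map<String, ContentAwareMimeTypeService> byMode = new ConcurrentHashMap<>();

    // An implementation announces itself together with its detection mode.
    void register(String detectionMode, ContentAwareMimeTypeService service) {
        byMode.put(detectionMode, service);
    }

    void unregister(String detectionMode) {
        byMode.remove(detectionMode);
    }

    // A client asks for the mechanism it wants; empty if none is registered for that mode.
    Optional<ContentAwareMimeTypeService> find(String detectionMode) {
        return Optional.ofNullable(byMode.get(detectionMode));
    }
}
```

With only one implementation available the lookup is redundant, which matches the observation in the comment that the property is of no use until a second mechanism exists.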
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539484#comment-14539484 ]

Satya Deep Maheshwari commented on SLING-4694:
----------------------------------------------

[~cziegeler], thanks for reviewing. I will address your observations from the review.

{quote} I guess the most interesting question is how to enable this? {quote}
Would it be any different from how the webdav servlet uses it in the proposed implementation? Can you please elaborate a bit on this? By opt-in/opt-out, do you mean that the service should have an option of indicating whether or not it can handle the request?
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539284#comment-14539284 ]

Satya Deep Maheshwari commented on SLING-4694:
----------------------------------------------

Here's the pull request for the proposed implementation: https://github.com/apache/sling/pull/89

[~bdelacretaz], please review. It includes unit tests for the newly added contentdetection bundle. I haven't yet added unit tests for the changes done in the webdav and commons/mime bundles.
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539275#comment-14539275 ]

ASF GitHub Bot commented on SLING-4694:
---

GitHub user sdmcraft opened a pull request:

    https://github.com/apache/sling/pull/89

    SLING-4694:Add ability to identify mime type based on file content

    * Initial checkin. Implements the basic idea. See https://issues.apache.org/jira/browse/SLING-4694 for details

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sdmcraft/sling SLING-4694

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/sling/pull/89.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #89

commit fefeec67c83cdd6e23b70784cf732ef692082bb4
Author: Satya Deep Maheshwari sat...@adobe.com
Date: 2015-05-08T07:55:09Z

    SLING-4694:Add ability to identify mime type based on file content

    * Initial checkin. Implements the basic idea. See https://issues.apache.org/jira/browse/SLING-4694 for details

Add ability to identify mime type based on file content
---

Key: SLING-4694
URL: https://issues.apache.org/jira/browse/SLING-4694
Project: Sling
Issue Type: Improvement
Components: Servlets
Affects Versions: JCR Webdav 2.2.2
Reporter: Satya Deep Maheshwari

*Problem description:*
I am facing a problem with the mime type detection of a file. While debugging, I see that the SlingTikaDetector.detect method is used for detecting the mime type of my file. See [1]. This method seems to rely only on the name of the file for detecting its mime type. Even though it is passed an InputStream of the file, it does not seem to use it for mime type detection. So if my file name is something like xyz.tmp, it detects its mime type as application/octet-stream (the default) while it may actually be a png file. This is a common scenario with WebDAV clients, which create temporary files with such names while a file is being edited.

*Suggested Solution:*
Quoting [~rombert]
{quote}
Following the discussions at SLING-1059 [1] and SLING-255 [2] I can infer that we more or less opted out of the 'heavy-weight' approach of actually parsing the input stream. Not sure if we want to revisit that TBH. At any rate, our MimeTypeService does not have an API for getting the file content based on the input stream.

I think though there's a way around it, but only at the code level. The org.apache.sling.jcr.webdav.impl.helper.SlingResourceConfig class hardcodes the Detector implementation to be a SlingTikaDetector. I think it is worthwhile to raise a Jira issue for this and it would definitely expedite the fix if you're willing to submit a patch / pull request.

I think it can be as simple as adding a @Reference to a Tika Detector to the SlingWebDavServlet and then passing that to the SlingServletConfig.

Cheers,
Robert

[1]: https://issues.apache.org/jira/browse/SLING-1059
[2]: https://issues.apache.org/jira/browse/SLING-255
{quote}

Related mailing-list thread on this: http://apache-sling.73963.n3.nabble.com/mime-type-detection-td4050586.html

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
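To make the problem description above concrete, here is a minimal Java sketch of content-first detection with a name-based fallback. It is not the SlingTikaDetector code and not the patch from the pull request: the class MimeSniffer and its methods are hypothetical names chosen for illustration, and only the PNG signature is checked, whereas a real detector such as Tika would consult a much larger set of signatures.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Sling or Tika.
final class MimeSniffer {

    static final String DEFAULT_TYPE = "application/octet-stream";

    // The fixed 8-byte signature that starts every PNG file.
    private static final byte[] PNG_MAGIC = {
        (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    };

    private MimeSniffer() {}

    // Prefers the content; uses the file name only when the bytes are not recognised.
    static String detect(String fileName, InputStream content) {
        String byContent = detectByContent(content);
        return byContent != null ? byContent : detectByName(fileName);
    }

    // Peeks at the leading bytes and resets the stream so the caller can still read it.
    // Returns null when the stream is absent, cannot be reset, or is not recognised.
    static String detectByContent(InputStream content) {
        if (content == null || !content.markSupported()) {
            return null;
        }
        content.mark(PNG_MAGIC.length);
        try {
            byte[] head = new byte[PNG_MAGIC.length];
            int read = 0;
            while (read < head.length) {
                int n = content.read(head, read, head.length - read);
                if (n < 0) {
                    break; // stream is shorter than the signature
                }
                read += n;
            }
            content.reset();
            boolean isPng = read == head.length && Arrays.equals(head, PNG_MAGIC);
            return isPng ? "image/png" : null;
        } catch (IOException e) {
            // Simplification for the sketch: surface I/O failures unchecked.
            throw new UncheckedIOException(e);
        }
    }

    // Name-only lookup, which is all the behaviour reported in the issue amounts to.
    static String detectByName(String fileName) {
        if (fileName != null && fileName.toLowerCase(Locale.ROOT).endsWith(".png")) {
            return "image/png";
        }
        return DEFAULT_TYPE;
    }
}
```

With this sketch, a stream that starts with the PNG signature is reported as image/png even when it is named xyz.tmp, while a name-only lookup for the same file yields application/octet-stream.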
[jira] [Commented] (SLING-4694) Add ability to identify mime type based on file content
[ https://issues.apache.org/jira/browse/SLING-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532543#comment-14532543 ]

Satya Deep Maheshwari commented on SLING-4694:
---

{quote}
Best is probably to create a new commons/content-detection bundle
{quote}
If I create a separate bundle, I lose the ability to extend the existing concrete implementation, MimeTypeServiceImpl, which is an internal class of the commons/mime bundle. I could either copy it over to the new bundle, or make it abstract and export it from commons/mime. I would prefer not to do either of these. Is there another way to do this?
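As an illustration of the trade-off raised in this comment, the following Java sketch shows one general alternative to inheritance: a content-aware service in its own bundle that wraps the existing name-based lookup through its interface instead of extending the internal implementation class. Everything here is hypothetical. NameBasedLookup stands in for whatever interface the existing service exports, ContentAwareLookup is an invented name, and the content detector is passed in as a plain function; this is not the design the Sling project adopted for this issue.

```java
import java.io.InputStream;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical stand-in for the name-based lookup exported by the existing bundle.
interface NameBasedLookup {
    // Assumed to return null when the name is not recognised.
    String getMimeType(String name);
}

// Hypothetical content-aware service: composition instead of subclassing.
final class ContentAwareLookup {

    private final NameBasedLookup delegate;
    private final Function<InputStream, String> contentDetector;

    ContentAwareLookup(NameBasedLookup delegate,
                       Function<InputStream, String> contentDetector) {
        this.delegate = Objects.requireNonNull(delegate);
        this.contentDetector = Objects.requireNonNull(contentDetector);
    }

    // Name-only lookups are forwarded unchanged to the existing service.
    String getMimeType(String name) {
        return delegate.getMimeType(name);
    }

    // Content is consulted first; the name is used only if detection yields nothing.
    String getMimeType(String name, InputStream content) {
        String byContent = content == null ? null : contentDetector.apply(content);
        return byContent != null ? byContent : delegate.getMimeType(name);
    }
}
```

Because the wrapper depends only on an interface, nothing has to be copied out of the internal implementation or exported from it; the cost is that any behaviour private to the implementation class stays out of reach.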