Dear Wiki user, You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.
The "ContentMimeDetection" page has been changed by Lukeliush:
https://wiki.apache.org/tika/ContentMimeDetection?action=diff&rev1=6&rev2=7

  https://issues.apache.org/jira/browse/TIKA-1582

- Motivation:
+ '''Motivation'''

@@ -24, +24 @@

  This approach could also make identification safer, because it trusts a file's claimed type only when the file's byte histogram resembles a pattern seen during training. This has pros and cons. The advantage is stronger security in file type identification. The drawbacks are slow detection, since every byte of the file must be read to compute the byte histogram, and possible myopia toward the training data, which may not be representative.

- Methods:
+ '''Methods'''

  As mentioned, content-based MIME detection follows a standard data mining process:
