Re: Porting uses of 'accuracy' in KMimeType API

2014-10-12 Thread David Faure
On Friday 12 September 2014 21:06:36 Kevin Funk wrote:
 On Friday 12 September 2014 10:50:36 David Faure wrote:
  On Friday 12 September 2014 09:39:42 Kevin Funk wrote:
   Heya,
   
   Context: Forward-porting some patches in KDevelop involving KMimeType
   API.
   
   I've just noticed that in Qt5, QMimeDatabase/QMimeType doesn't allow me
   to retrieve the accuracy of a match anymore, while KMimeType did. For
   example, compare the possible arguments for
   QMimeDatabase::mimeTypeForUrl vs. KMimeType::findByUrl.
   
   What's the suggested way to deal with this? As far as I can see, there
   are no porting notes about that particular matter.
  
  You're telling me everything I know already, but not the important bit
  which is: what do you need the accuracy for?
 
 Well, this is ancient, performance-critical code inside KDevelop. We
 basically cache an extension-to-language mapping for performance reasons.
 Just recently we introduced code that avoids caching extensions for which
 the KMimeType lookup yielded a very low accuracy.
 
 Also see: https://git.reviewboard.kde.org/r/120085/

OK, I see.
I confirm that putting /* at the top of a text file without extension (or 
without a known extension) leads to text/x-csrc.

To avoid this, you could decide to only trust extensions.
In an IDE, I'd say that's probably valid.
i.e. I would do
db.mimeTypeForFile(fileName, QMimeDatabase::MatchExtension);

(I was about to suggest comparing the results of MatchExtension
and MatchContent, but that wouldn't make sense. If there's an extension, you 
want to use it; only if there isn't one can MatchContent be useful, and then 
it leads to exactly what you don't want: inaccurate results.)

MatchContent is more useful in e.g. an image viewer than in a C++ IDE,
because image formats are much more strictly defined and easy to recognize. 
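
The extension-only approach can be sketched without Qt as follows. This is a minimal model, not KDevelop's actual code: MimeCache, ExtensionResolver and extensionOf are hypothetical names invented for the sketch. In real code the resolver would presumably wrap db.mimeTypeForFile(fileName, QMimeDatabase::MatchExtension) and return nothing when QMimeType::isDefault() is true.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical resolver: maps a file name to a MIME type name by extension
// only, or returns std::nullopt when no specific type is known.
using ExtensionResolver =
    std::function<std::optional<std::string>(const std::string &)>;

// Returns the extension including the dot, or "" when there is none.
// Dotfiles such as ".bashrc" are treated as having no extension.
std::string extensionOf(const std::string &fileName)
{
    const auto slash = fileName.find_last_of('/');
    const auto start = (slash == std::string::npos) ? 0 : slash + 1;
    const auto dot = fileName.find_last_of('.');
    if (dot == std::string::npos || dot <= start)
        return {};
    return fileName.substr(dot);
}

// Caches extension -> MIME type, but only for results the resolver is
// willing to vouch for. Files without an extension are never cached.
class MimeCache
{
public:
    explicit MimeCache(ExtensionResolver resolver)
        : m_resolver(std::move(resolver)) {}

    std::optional<std::string> mimeTypeFor(const std::string &fileName)
    {
        const std::string ext = extensionOf(fileName);
        if (ext.empty())
            return std::nullopt; // nothing trustworthy to key the cache on

        const auto it = m_cache.find(ext);
        if (it != m_cache.end())
            return it->second; // cache hit: resolver is not called

        std::optional<std::string> result = m_resolver(fileName);
        if (result)
            m_cache.emplace(ext, *result); // cache only specific matches
        return result;
    }

    std::size_t cachedEntries() const { return m_cache.size(); }

private:
    ExtensionResolver m_resolver;
    std::unordered_map<std::string, std::string> m_cache;
};
```

With this shape, a file that merely starts with /* but has no known extension never reaches the cache, so a content-based guess like text/x-csrc cannot be stored for it.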

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Porting uses of 'accuracy' in KMimeType API

2014-09-13 Thread Milian Wolff
On Friday 12 September 2014 21:06:36 Kevin Funk wrote:
 On Friday 12 September 2014 10:50:36 David Faure wrote:
  On Friday 12 September 2014 09:39:42 Kevin Funk wrote:
   Heya,
   
   Context: Forward-porting some patches in KDevelop involving KMimeType
   API.
   
   I've just noticed that in Qt5, QMimeDatabase/QMimeType doesn't allow me
   to retrieve the accuracy of a match anymore, while KMimeType did. For
   example, compare the possible arguments for
   QMimeDatabase::mimeTypeForUrl vs. KMimeType::findByUrl.
   
   What's the suggested way to deal with this? As far as I can see, there
   are no porting notes about that particular matter.
  
  You're telling me everything I know already, but not the important bit
  which is: what do you need the accuracy for?
 
 Well, this is ancient, performance-critical code inside KDevelop. We
 basically cache an extension-to-language mapping for performance reasons.
 Just recently we introduced code that avoids caching extensions for which
 the KMimeType lookup yielded a very low accuracy.
 
 Also see: https://git.reviewboard.kde.org/r/120085/

With Qt5, we need to re-evaluate the performance of the mimetype checking. Maybe 
it's good enough nowadays. Otherwise I agree that we'll need some information 
on the accuracy of the match.

Bye
-- 
Milian Wolff
m...@milianw.de
http://milianw.de


Porting uses of 'accuracy' in KMimeType API

2014-09-12 Thread Kevin Funk
Heya,

Context: Forward-porting some patches in KDevelop involving KMimeType API.

I've just noticed that in Qt5, QMimeDatabase/QMimeType doesn't allow me to 
retrieve the accuracy of a match anymore, while KMimeType did. For example, 
compare the possible arguments for QMimeDatabase::mimeTypeForUrl vs. 
KMimeType::findByUrl.

What's the suggested way to deal with this? As far as I can see, there are no 
porting notes about that particular matter.

(Interestingly, the QMimeDatabase implementation internally plays around with 
accuracies a lot; it just isn't exposed in the public API.)

Thanks

[1] 
http://api.kde.org/frameworks-api/frameworks5-apidocs/kdelibs4support/html/classKMimeType.html#a3417e83a30cff614a01a29ca2a615443

-- 
Kevin Funk | kf...@kde.org | http://kfunk.org


Re: Porting uses of 'accuracy' in KMimeType API

2014-09-12 Thread David Faure
On Friday 12 September 2014 09:39:42 Kevin Funk wrote:
 Heya,
 
 Context: Forward-porting some patches in KDevelop involving KMimeType API.
 
 I've just noticed that in Qt5, QMimeDatabase/QMimeType doesn't allow me to
 retrieve the accuracy of a match anymore, while KMimeType did. For example,
 compare the possible arguments for QMimeDatabase::mimeTypeForUrl vs.
 KMimeType::findByUrl.
 
 What's the suggested way to deal with this? As far as I can see, there are
 no porting notes about that particular matter.

You're telling me everything I know already, but not the important bit which 
is: what do you need the accuracy for?

The mime matching tries its best to find out the mimetype based on what you 
give it (filename and/or content). The output of that is the best idea we have 
about the mimetype. What difference does it make if that's a 20% or an 80% 
accuracy?

 (Interestingly, the QMimeDatabase implementation internally plays around
 with accuracies a lot; it just isn't exposed in the public API.)

Yes, as per the mimetype spec.

My guess is that it was used as a workaround for a bad API or implementation, 
e.g. try with the content and, if that's not accurate, try with the 
filename, or vice versa. But QMimeDatabase has the right all-in-one methods 
for this, following the spec: if both filename and content are available, 
trust the filename, unless multiple mimetypes claim the same glob; in that 
case look at the content and select the type that matches (or is a subclass 
of) one of the candidates based on the filename.

The intended way to use this is to use isDefault() on the QMimeType,
if that's true then it couldn't find any specific mimetype for the file
(i.e. basically accuracy 0). And otherwise, that's the mimetype.
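
A rough, Qt-free model of that resolution rule may make it easier to follow. It is a simplified sketch, not QMimeDatabase's implementation: resolveMimeType and its parameters are hypothetical names, subclass ("is a subclass of") handling is left out, and the fallback used when the content doesn't settle an ambiguous glob is an assumption of the sketch.

```cpp
#include <optional>
#include <string>
#include <vector>

// The catch-all type; QMimeType::isDefault() is true for its Qt equivalent.
const std::string kDefaultMimeType = "application/octet-stream";

// globCandidates: types whose glob pattern matches the file name.
// sniffedType:    type detected from the content, if content was available.
std::string resolveMimeType(const std::vector<std::string> &globCandidates,
                            const std::optional<std::string> &sniffedType)
{
    // Exactly one type claims the file name: trust it, ignore the content.
    if (globCandidates.size() == 1)
        return globCandidates.front();

    // No glob matched: the content is all there is to go on.
    if (globCandidates.empty())
        return sniffedType.value_or(kDefaultMimeType);

    // Several types claim the same glob: let the content pick among them.
    // (Exact match only; the subclass check is omitted in this sketch.)
    if (sniffedType) {
        for (const std::string &candidate : globCandidates) {
            if (candidate == *sniffedType)
                return candidate;
        }
    }

    // Assumption of this sketch: if the content doesn't decide, fall back
    // to the first glob candidate.
    return globCandidates.front();
}

// Mirrors the isDefault() check: true means "no specific type was found".
bool isDefault(const std::string &mimeType)
{
    return mimeType == kDefaultMimeType;
}
```

The caller only ever sees one answer, and the single question left to ask is whether it is the default type, which is the role isDefault() plays in the public API.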

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5



Re: Porting uses of 'accuracy' in KMimeType API

2014-09-12 Thread Kevin Funk
On Friday 12 September 2014 10:50:36 David Faure wrote:
 On Friday 12 September 2014 09:39:42 Kevin Funk wrote:
  Heya,
  
  Context: Forward-porting some patches in KDevelop involving KMimeType API.
  
  I've just noticed that in Qt5, QMimeDatabase/QMimeType doesn't allow me to
  retrieve the accuracy of a match anymore, while KMimeType did. For
  example, compare the possible arguments for
  QMimeDatabase::mimeTypeForUrl vs. KMimeType::findByUrl.
  
  What's the suggested way to deal with this? As far as I can see, there are
  no porting notes about that particular matter.
 
 You're telling me everything I know already, but not the important bit which
 is: what do you need the accuracy for?

Well, this is ancient, performance-critical code inside KDevelop. We basically 
cache an extension-to-language mapping for performance reasons. Just recently 
we introduced code that avoids caching extensions for which the KMimeType 
lookup yielded a very low accuracy.

Also see: https://git.reviewboard.kde.org/r/120085/

Note: I never touched that code myself.

 The mime matching tries its best to find out the mimetype based on what you
 give it (filename and/or content). The output of that is the best idea we
 have about the mimetype. What difference does it make if that's a 20% or an
 80% accuracy?
 
  (Interestingly, the QMimeDatabase implementation internally plays around
  with accuracies a lot; it just isn't exposed in the public API.)
 
 Yes, as per the mimetype spec.
 
 My guess is that it was used as a workaround for bad api or implementation,
 like try with the content, and if that's not accurate, try with the
 filename, or vice-versa. But QMimeDatabase has the right all-in-one methods
 for this, following the spec, i.e. if both filename and content are
 available, then trust the filename, unless there are multiple mimetypes
 claiming the same glob, then look at content, and select the one that
 matches (or is a subclass of) one of the candidates based on the filename.
 
 The intended way to use this is to use isDefault() on the QMimeType,
 if that's true then it couldn't find any specific mimetype for the file
 (i.e. basically accuracy 0). And otherwise, that's the mimetype.

-- 
Kevin Funk | kf...@kde.org | http://kfunk.org