D14674: handle non-ASCII encodings of file names in tar archives

2018-08-29 Thread Rinat Ibragimov
ibragimovrinat added a comment.


  @aacid, I've just sent file to the e-mail. I also attached the same file to 
this message: F6223361: tar_non_ascii_file_name.tar.gz 


REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: aacid, xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, 
michaelh, ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-28 Thread Albert Astals Cid
aacid added a comment.


  @ibragimovrinat Actually can you please send me that tarball file to 
aa...@kde.org ? I don't seem to be able to recreate that from the diff file you 
attached.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: aacid, xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, 
michaelh, ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-28 Thread Albert Astals Cid
aacid added a comment.


  @ibragimovrinat ah my fault, arc didn't create the tar.gz file 
correctly from the diff

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: aacid, xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, 
michaelh, ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-28 Thread Albert Astals Cid
aacid added a comment.


  @ibragimovrinat KArchiveTest::testTarShortNonASCIINames is failing both in CI 
[1] and for me locally
  
  23:54:20 FAIL!  : KArchiveTest::testTarShortNonASCIINames() Compared values 
are not the same
  23:54:20Actual   (listing.count()): 0
  23:54:20Expected (1)  : 1
  23:54:20Loc: [/home/jenkins/workspace/Frameworks karchive kf5-qt5 
SUSEQt5.9/autotests/karchivetest.cpp(815)]
  
  Can you have a look please?
  
  [1] 
https://build.kde.org/job/Frameworks%20karchive%20kf5-qt5%20SUSEQt5.9/30/console

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: aacid, xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, 
michaelh, ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-28 Thread Albert Astals Cid
aacid closed this revision.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-28 Thread Friedrich W. H. Kossebau
kossebau added a comment.


  Someone still needs to land this in the karchive repo, no? :) @ibragimovrinat 
I assume this is your first contribution, so you cannot yet push commits 
yourself, right? For that we will need an email address for the authorship of 
this patch in the commit, next to your name "Rinat Ibragimov". Which address 
can we use?
  
  (Only saw this while doing house-keeoing on my phab list, as before not 
enough clue about katchive currently, so needs someone else to land this, some 
of the reviewers should be able.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-13 Thread Rinat Ibragimov
ibragimovrinat added a comment.


  Thanks!

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-13 Thread David Faure
dfaure accepted this revision.
dfaure added a comment.
This revision is now accepted and ready to land.


  Excellent, thanks for the contribution.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-12 Thread Rinat Ibragimov
ibragimovrinat marked 2 inline comments as done.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-12 Thread Rinat Ibragimov
ibragimovrinat updated this revision to Diff 39537.
ibragimovrinat added a comment.


  Updated diff to use QString everywhere instead of QStringLiteral and 
QLatin1String. As those are in tests, performance penalty shouldn't matter. 
Also using QCOMPARE for comparisons.

REPOSITORY
  R243 KArchive

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D14674?vs=39480=39537

REVISION DETAIL
  https://phabricator.kde.org/D14674

AFFECTED FILES
  autotests/karchivetest.cpp
  autotests/karchivetest.h
  autotests/tar_non_ascii_file_name.tar.gz
  src/ktar.cpp

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-12 Thread David Faure
dfaure requested changes to this revision.
dfaure added a comment.
This revision now requires changes to proceed.


  Great job, thanks for the unittest.
  
  Just a few minor adjustments and this is good to go in.

INLINE COMMENTS

> karchivetest.cpp:776
> +const QString longName =
> +QStringLiteral("раз-два-три-четыре-пять-вышел-зайчик-погулять-вдруг-"
> +   "охотник-выбегает-прямо-в-зайчика.txt");

This should be QString::fromUtf8(), because QStringLiteral doesn't support 
UTF-8 on Windows. I *think* it also doesn't support multiline literals, so make 
it a single line.

Alternatively, just use QString("...") like you do further below, that supports 
utf8 and multiline literals.

> karchivetest.cpp:798
> +
> +QVERIFY(listing.count() == 1);
> +

QCOMPARE(listing.count(), 1);

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-11 Thread Rinat Ibragimov
ibragimovrinat updated this revision to Diff 39480.
ibragimovrinat added a comment.


  Added couple of tests. One tests that file name's not truncated, and the 
other tests that checksum is calculated in a correct way.

REPOSITORY
  R243 KArchive

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D14674?vs=39258=39480

REVISION DETAIL
  https://phabricator.kde.org/D14674

AFFECTED FILES
  autotests/karchivetest.cpp
  autotests/karchivetest.h
  autotests/tar_non_ascii_file_name.tar.gz
  src/ktar.cpp

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-09 Thread Rinat Ibragimov
ibragimovrinat added inline comments.

INLINE COMMENTS

> cfeck wrote in ktar.cpp:777
> Does the spec say `> 99` or `> 100`? The comment and the code should match.

The comment is right in some sense, 100 bytes means 99 bytes of name and 1 byte 
of '\0'.

I don't know for sure if there is a (single) spec actually. At least GNU tar 
(http://www.gnu.org/software/tar/manual/html_node/Standard.html) mentions that 
although name[100] is a 100-byte field, "The name, linkname, magic, uname, and 
gname are null-terminated character strings." So, there could be string of at 
most 99 bytes, plus a zero-terminator. However, there is a special case in this 
source file where 100-byte but non-zero-terminated names are handled (search 
for "bug:101472" string in the source). Maybe, there are such tar files in the 
wild.

How should I change the comment?

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-08 Thread Julian Schraner
xyquadrat added a comment.


  Two links that might be useful when writing the unit test:
  https://community.kde.org/Guidelines_and_HOWTOs/UnitTests
  https://wiki.qt.io/Writing_Unit_Tests
  Also, there are already a few unit tests in this repo that you could take a 
look at.
  
  Thank you for your patch!

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: xyquadrat, broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, 
ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-08 Thread David Faure
dfaure added a comment.


  Looks good, but indeed a unittest addition would be very welcome.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, ngraham, 
bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-07 Thread Kai Uwe Broulik
broulik added a comment.


  Sorry, I just added you as I looked at who committed to that repo most, to 
ensure this review doesn't go unnoticed for the lack of reviewers :)

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, ngraham, 
bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-07 Thread Friedrich W. H. Kossebau
kossebau added a subscriber: broulik.
kossebau added a comment.


  Thanks for proposing a patch. for the bug  @ibragimovrinat
  Myself I only contributed once in former times to karchive, all memories 
lost, not sure if I have time to dive into that codebase again, despite 
@broulik here poking me :)
  
  One thing though for sure which would be wanted here is hardening the code by 
having some unit tests covering ascii-only and non-ascii file names. So whoever 
might review this patch surely would like to see some samples tested somewhere 
below autotests/ :)

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: broulik, cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, ngraham, 
bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-07 Thread Christoph Feck
cfeck added inline comments.

INLINE COMMENTS

> ktar.cpp:777
> +// If more than 100 bytes, we need to use the LongLink trick
> +if (encodedFileName.length() > 99) {
>  d->writeLonglink(buffer, encodedFileName, 'L', uname.constData(), 
> gname.constData());

Does the spec say `> 99` or `> 100`? The comment and the code should match.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: cfeck, ibragimovrinat, kde-frameworks-devel, michaelh, ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-07 Thread Kai Uwe Broulik
broulik added reviewers: dfaure, kossebau.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

To: ibragimovrinat, dfaure, kossebau
Cc: ibragimovrinat, kde-frameworks-devel, michaelh, ngraham, bruns


D14674: handle non-ASCII encodings of file names in tar archives

2018-08-07 Thread Rinat Ibragimov
ibragimovrinat created this revision.
Restricted Application added a project: Frameworks.
Restricted Application added a subscriber: kde-frameworks-devel.
ibragimovrinat requested review of this revision.

REVISION SUMMARY
  BUG: 266141
  
  As UTF-8 may use multiple bytes per character, it's required to measure 
encoded file name length in bytes, not just in characters. It's possible to 
accidentally truncate file name if its name in characters is shorter that 100, 
but in bytes — longer than 100. Also, checksum calculation must be performed on 
unsigned bytes. Otherwise bytes in range 0x80-0xff when promoted to int become 
negative.

REPOSITORY
  R243 KArchive

REVISION DETAIL
  https://phabricator.kde.org/D14674

AFFECTED FILES
  src/ktar.cpp

To: ibragimovrinat
Cc: ibragimovrinat, kde-frameworks-devel, michaelh, ngraham, bruns