[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-12-31 Thread NSLW
https://bugs.kde.org/show_bug.cgi?id=371069

NSLW  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID
                 CC|                            |lukasz.wojnilow...@gmail.com

--- Comment #11 from NSLW  ---
Closing based on comment #8 and on my successfully opening the provided file in
the master version. Reopen if necessary.

-- 
You are receiving this mail because:
You are watching all bug changes.

[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-25 Thread Thomas Baumgart via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #10 from Thomas Baumgart  ---
From what I can tell, the data looks good if I read it using the UTF-16 decoder
on the next page in the wizard. Nothing fancy or garbled.

And I just omitted the return type void when I referenced the method. It is
the current git head of the 4.8 branch.



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-22 Thread allan via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #9 from allan  ---
(In reply to Thomas Baumgart from comment #8)
> I tried this on my KDE4, KMyMoney 4.8 production system (this is generated
> of HEAD on the 4.8 branch).

> When I change the encoding in the dialog to UTF-16 before I select the file,
> then things seem to work properly.

Just to be clear, are you saying that you do not then see the problem I
reported, of the data being garbled?  If I ensure that UTF-16 is already
selected - displayed in the file selector - then I definitely do see the
corruption.

> I am looking at the following snippet in
> CSVDialog::readFile(const QString& fname):

Might this be a non-current git head version?  There has recently been some
reformatting of the source and I have void CSVWizard::readFile(const QString&
fname), as does the current git head.  In terms of the actual code, they are
identical in this particular area.  Just to be clear, again.

So far as the selector is concerned, then, yes, there are a couple of problems,
although if the decoding is for UTF-16, from previous activity or from setting
it prior to selecting the file, then the file should have the required
encoding.  Some tweaking looks to be necessary though.

Allan
> 
>   QFile m_inFile(m_inFileName);
>   m_inFile.open(QIODevice::ReadOnly);  // allow a Carriage return - // QIODevice::Text
>   QTextStream inStream(&m_inFile);
>   QTextCodec* codec = QTextCodec::codecForMib(m_codecs.value(m_encodeIndex)->mibEnum());
>   inStream.setCodec(codec);
> 
>   QString buf = inStream.readAll();
> 
> When selecting UTF-16 before selecting your file, QString buf contained the
> correct data. I verified this in the debugger, and the data displayed in
> spreadsheet form also seemed to be correct.
> 
> Hope that helps for further investigation.



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-22 Thread Thomas Baumgart via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #8 from Thomas Baumgart  ---
I tried this on my KDE4, KMyMoney 4.8 production system (this is generated of
HEAD on the 4.8 branch).

What is annoying is that once I select a file, it automatically goes off, with
no way to change parameters first. One should be able to start the process by
pressing the OK button. This causes the UTF-16 data to be displayed as garbage
because of the 0 bytes it contains.

When I change the encoding in the dialog to UTF-16 before I select the file,
then things seem to work properly. I am looking at the following snippet in
CSVDialog::readFile(const QString& fname):

  QFile m_inFile(m_inFileName);
  m_inFile.open(QIODevice::ReadOnly);  // allow a Carriage return - // QIODevice::Text
  QTextStream inStream(&m_inFile);
  QTextCodec* codec = QTextCodec::codecForMib(m_codecs.value(m_encodeIndex)->mibEnum());
  inStream.setCodec(codec);

  QString buf = inStream.readAll();

When selecting UTF-16 before selecting your file, QString buf contained the
correct data. I verified this in the debugger, and the data displayed in
spreadsheet form also seemed to be correct.

Hope that helps for further investigation.
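
[Editor's illustration] The "weird data due to the 0's" effect can be shown
with a minimal standard C++ sketch. This is not KMyMoney or Qt code:
showAs8Bit is a hypothetical helper that mimics an 8-bit codec by treating
every byte as one character. NUL bytes are rendered as '.' so they are visible.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from KMyMoney): render raw bytes the way an
// 8-bit codec would see them, one character per byte.
// NUL bytes become '.', bytes above 0x7F become '?'.
std::string showAs8Bit(const std::vector<std::uint8_t>& bytes)
{
    std::string out;
    for (std::uint8_t b : bytes) {
        if (b == 0x00)
            out += '.';
        else if (b < 0x80)
            out += static_cast<char>(b);
        else
            out += '?';
    }
    return out;
}
```

Fed the UTF-16LE bytes 22 00 4D 00 52 00, the helper yields one visible
character followed by one '.' for each 16-bit unit, which is the pattern of
interleaved zeros described above.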



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-22 Thread allan via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #7 from allan  ---
[from Thomas]
"
Hi Allan,

you found out yourself: the BOM is not wrong, it's missing. I am sure you 
stumbled over https://en.wikipedia.org/wiki/UTF-16.

I have not looked at the code of the CSV importer at that point, but you could 
check the beginning of the file (4 bytes) and see if the first two match a BOM 
or you find two 0x00 in those four bytes (where that would probably not work 
for Asian countries as they fill the upper byte with their characters).

Reading the data through a QTextStream allows you to set up the encoding/decoding. 
Please take a look at QTextStream::setCodec() and setAutoDetectUnicode(), 
though according to the docs I have, the automatic detection should be the 
default. Maybe you could add another UI selector for the encoding, in case you 
don't have one. See QTextCodec::availableCodecs() for a list of them. Check 
the Tools/Encoding menu in Kate/KWrite to see how this may look.

So much for now. If you don't get the codec stuff going, please tell me where 
to find the relevant source code and I will take a look at it.

Thomas"

[My reply]
"
I have /had -

QTextStream inStream(&m_inFile);
QTextCodec* codec = QTextCodec::codecForMib(m_codecs.value(m_encodeIndex)->mibEnum());
inStream.setCodec(codec);

QString buf = inStream.readAll();
...
(void CSVWizard::readFile(const QString& fname) line c843)
which I nicked from Qt, I think.

I have encoding selection in the file selector.
...
  QPointer<QLabel> label = new QLabel(i18n("Encoding"));
  dialog->layout()->addWidget(label);
  //Add encoding selection to FileDialog
  QPointer<QComboBox> comboBoxEncode = new QComboBox();
  setCodecList(m_codecs, comboBoxEncode);
  comboBoxEncode->setCurrentIndex(m_encodeIndex);
  connect(comboBoxEncode, SIGNAL(activated(int)), this, SLOT(encodingChanged(int)));
  dialog->layout()->addWidget(comboBoxEncode);

(bool CSVWizard::getInFileName(QString& inFileName) line c798)

I don't see a setAutoDetectUnicode().

I don't think I had auto-selection; encoding was by manual selection
from the list of codecs, but UTF-16 seems not to work (in my code).

Allan"

I'm afraid I'm not able to commit to coding, still.
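
[Editor's illustration] Thomas's suggestion in the quoted mail (look at the
first four bytes for a BOM, or for two 0x00 bytes) could be sketched roughly
as follows in standard C++. This is only an assumption of how such a check
might look, not code from the CSV importer; the enum Utf16Guess and the
function guessUtf16 are hypothetical names. As Thomas notes, the zero-byte
heuristic is unreliable for text whose upper bytes are non-zero, and the
sketch does not try to tell a UTF-16LE BOM apart from a UTF-32LE one.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result type for the guess; not part of KMyMoney or Qt.
enum class Utf16Guess {
    LittleEndianBom,    // starts with FF FE
    BigEndianBom,       // starts with FE FF
    LittleEndianNoBom,  // no BOM, zero bytes at offsets 1 and 3
    BigEndianNoBom,     // no BOM, zero bytes at offsets 0 and 2
    Unknown
};

// Inspect up to the first four bytes of a file and guess its encoding.
Utf16Guess guessUtf16(const std::vector<std::uint8_t>& head)
{
    // A BOM, when present, settles the question.
    if (head.size() >= 2) {
        if (head[0] == 0xFF && head[1] == 0xFE)
            return Utf16Guess::LittleEndianBom;
        if (head[0] == 0xFE && head[1] == 0xFF)
            return Utf16Guess::BigEndianBom;
    }
    // No BOM: look for the zero high bytes of Latin text.
    if (head.size() >= 4) {
        const bool oddZero = head[1] == 0x00 && head[3] == 0x00;
        const bool evenZero = head[0] == 0x00 && head[2] == 0x00;
        if (oddZero && !evenZero && head[0] != 0x00 && head[2] != 0x00)
            return Utf16Guess::LittleEndianNoBom;
        if (evenZero && !oddZero && head[1] != 0x00 && head[3] != 0x00)
            return Utf16Guess::BigEndianNoBom;
    }
    return Utf16Guess::Unknown;
}
```

A caller could use the result to preselect the encoding entry in the file
dialog while still letting the user override it.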



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-21 Thread Jack via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

Jack  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ostroffjh@users.sourceforge.net

--- Comment #6 from Jack  ---
I think you might be able to use dos2unix to modify the file into a usable
format, or at least confirm info about the encoding and BOM.  It may take a
while to wade through all the options and parameters.



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-21 Thread allan via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #5 from allan  ---
Thomas suggested using Okteta to look at the data to see if the BOM was
correct.
Here are the first few lines :-
"
0000   22 00 4D 00  52 00 20 00  41 00 4C 00  ".M.R. .A.L.
000C   4C 00 41 00  4E 00 20 00  41 00 4E 00  L.A.N. .A.N.
0018   44 00 45 00  52 00 53 00  4F 00 4E 00  D.E.R.S.O.N.
"

So, no BOM, just the data, and still mis-formatted by the plugin.

Ah, I've just checked against the Libre Office Calc file and, just looking at
the beginning, the bad one has "22 00", and the good one has "FF FE".  So, the
BOM is wrong on the bank version.

Allan
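
[Editor's illustration] The dump above is plain little-endian UTF-16 without
a BOM: each character is a low byte followed by 00. A minimal standard C++
sketch of decoding such bytes is given below. It is not the plugin's code
(the plugin uses QTextStream and QTextCodec); decodeUtf16LE and appendUtf8
are hypothetical helpers, and the sketch assumes the byte order is already
known to be little-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Append one Unicode code point to a UTF-8 encoded string.
static void appendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decode little-endian UTF-16 bytes to UTF-8.
// A leading FF FE BOM is skipped if present; a missing BOM is not an error.
// Unpaired surrogates are replaced with U+FFFD; a trailing odd byte is ignored.
std::string decodeUtf16LE(const std::vector<std::uint8_t>& bytes)
{
    std::string out;
    std::size_t i = 0;
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
        i = 2;
    for (; i + 1 < bytes.size(); i += 2) {
        std::uint32_t unit = bytes[i] | (bytes[i + 1] << 8);
        if (unit >= 0xD800 && unit <= 0xDFFF) {
            bool paired = false;
            if (unit <= 0xDBFF && i + 3 < bytes.size()) {
                const std::uint32_t low = bytes[i + 2] | (bytes[i + 3] << 8);
                if (low >= 0xDC00 && low <= 0xDFFF) {
                    unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                    i += 2;  // consume the low surrogate as well
                    paired = true;
                }
            }
            if (!paired)
                unit = 0xFFFD;
        }
        appendUtf8(out, unit);
    }
    return out;
}
```

Applied to the 36 bytes shown in the dump, this decoding gives the text
"MR ALLAN ANDERSON preceded by a double quote, with or without a leading
FF FE.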



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-21 Thread allan via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #4 from allan  ---
(In reply to allan from comment #3)
> It looks like my attempts to provide an edited sample file either produce
> rubbish, or remove whatever causes the problem.
> So, I may have to provide a complete file, but I don't wish to broadcast it,
> so would like to send it off-line.  To whom?
> 
> Allan

I've sent a gpg'ed copy to Thomas.

Allan



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-18 Thread allan via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #3 from allan  ---
It looks like my attempts to provide an edited sample file either produce
rubbish, or remove whatever causes the problem.
So, I may have to provide a complete file, but I don't wish to broadcast it, so
would like to send it off-line.  To whom?

Allan



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-18 Thread allan via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

allan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #101617|0                           |1
        is obsolete|                            |

--- Comment #2 from allan  ---
Created attachment 101618
  --> https://bugs.kde.org/attachment.cgi?id=101618&action=edit
UTF-16 file



[kmymoney4] [Bug 371069] CSV plugin mishandles UTF-16 files

2016-10-18 Thread allan via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=371069

--- Comment #1 from allan  ---
Created attachment 101617
  --> https://bugs.kde.org/attachment.cgi?id=101617&action=edit
UTF-16 file
