Re: d2sqlite3 db.run, where lies the bug?

2018-04-10 Thread Ralph Amissah via Digitalmars-d

ag0aep6g, thank you most of all for fixing the bug and so removing my immediate
frustration! Thanks also for the thorough step-by-step instructions on sleuthing
it out and fixing it.

>
> If you want, you can make a bug report or a pull request with the fix.
> Otherwise, if you're not up to that, I can make one.
>

Generous. I merely stumbled on it and shouted out; you fixed it. I would be
grateful if you would file the bug and your fix. (I thought I might have more
to report, but your fix has resolved my multitude of issues with several texts.)
I did copy my original post to the author of d2sqlite3 (and again this reply,
which includes your fix), so he hopefully has the information required to make
the upstream fix. The diff/patch I used:

diff -Naur d2sqlite3-0.16.0/d2sqlite3/source/d2sqlite3/internal/util.d d2sqlite3-0.16.0-1/d2sqlite3/source/d2sqlite3/internal/util.d
--- d2sqlite3-0.16.0/d2sqlite3/source/d2sqlite3/internal/util.d 2018-04-10 20:34:11.584498926 -0400
+++ d2sqlite3-0.16.0-1/d2sqlite3/source/d2sqlite3/internal/util.d   2018-04-10 20:46:09.869812899 -0400
@@ -65,13 +65,14 @@
 {
 import std.algorithm : countUntil;
 import std.string : toStringz;
+import std.utf: byCodeUnit;

 size_t pos;
 bool complete;
 do
 {
 auto tail = sql[pos .. $];
-immutable offset = tail.countUntil(';') + 1;
+immutable offset = tail.byCodeUnit.countUntil(';') + 1;
 pos += offset;
 if (offset == 0)
 pos = sql.length;

(This assumes the patch is saved in a file called d2sqlite3_bugfix.diff, in the
same directory as the d2sqlite3-0.16.0 subdirectory.)
cp -av d2sqlite3-0.16.0 d2sqlite3-0.16.0-1 \
&& cd d2sqlite3-0.16.0-1 \
&& patch -Np1 < ../d2sqlite3_bugfix.diff
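
To make the code-unit versus code-point distinction behind this patch concrete,
here is a minimal sketch using only Phobos. The helper name `endOfStatement` and
the sample string are made up for illustration; this is not the d2sqlite3
source, only the same `byCodeUnit.countUntil` idiom that the patch uses.

```d
import std.algorithm : countUntil;
import std.utf : byCodeUnit;

/// Hypothetical helper: index just past the first ';' in `s`,
/// measured in UTF-8 code units; 0 if there is no ';'.
size_t endOfStatement(string s)
{
    // countUntil returns -1 when nothing is found, so +1 gives 0.
    return s.byCodeUnit.countUntil(';') + 1;
}

void main()
{
    // ’ (U+2019) is one code point but three UTF-8 code units.
    string sql = "a’b;c";
    assert(sql.length == 7);

    // On a plain string, countUntil decodes and counts code points.
    assert(sql.countUntil(';') == 3);

    // With byCodeUnit it counts code units, which is what slicing needs.
    assert(sql.byCodeUnit.countUntil(';') == 5);

    // The code-unit offset slices the statement correctly.
    assert(sql[0 .. endOfStatement(sql)] == "a’b;");
}
```

Slicing a `string` always works in code units, so only the second count is safe
to add to a slice index.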

On Tue, Apr 10 2018, ag0aep6g via Digitalmars-d wrote:

> On 04/10/2018 08:04 PM, Ralph Amissah wrote:
>> The exact location of problem may be provided in the error statement
>> "core.exception.UnicodeException@src/rt/util/utf.d(292): invalid
>> UTF-8 sequence".
>>
> [...]
>> Mock problem string with test code follows (d2sqlite3 required):
>>
> [... code ...]

[... snip ...]

>
>  From the exception's stack trace we see that
> `d2sqlite3.internal.util.byStatement(immutable(char)[]).ByStatement.findEnd`
> is the deepest non-Phobos function involved. So that's a good first spot
> to look for a bug. Let's check it out.
>
> https://github.com/biozic/d2sqlite3/blob/2e8211946ae0e09646d561aeae1361a695adcc17/source/d2sqlite3/internal/util.d#L64-L83
>
> And indeed, there's a bug in these lines:
>
> 
> auto tail = sql[pos .. $];
> immutable offset = tail.countUntil(';') + 1;
> pos += offset;
> 
>
> `pos` is used to slice the string `sql`. That means, `pos` is
> interpreted as a number of UTF-8 code *units*. But then the result of
> `countUntil` is added. `countUntil` counts code *points*. So a number of
> code points is mistaken as a number of code units. That means the next
> slicing can be incorrect and split up a multibyte sequence. And then
> `countUntil` will complain about broken UTF-8.
>
> This can be fixed by letting `countUntil` operate on code units
> instead:
>
> 
> import std.utf: byCodeUnit;
> immutable offset = tail.byCodeUnit.countUntil(';') + 1;
> 
>
> If you want, you can make a bug report or a pull request with the fix.
> Otherwise, if you're not up to that, I can make one.
>
> [...]
>>- DMD64 D Compiler v2.074.1
>
> That's rather old. I'd recommend updating if possible.

For the time being it reflects the status of the rolling development branch of
Debian. LDC is based on a newer version.

Thanks again.
Ralph Amissah


d2sqlite3 db.run, where lies the bug?

2018-04-10 Thread Ralph Amissah via Digitalmars-d
/+
Not sure where to report this, nor where the bug lies. I hope
SQLite (and d2sqlite3) is used widely enough for this to be of
interest here.

I have sets of document files that are broken up and inserted into
an sqlite3 db. Some of them fail with what is to me an inexplicable
utf-8 error: they contain no special characters, the error can be
"corrected" without removing any one character in particular, and so
far I cannot predict which ones fail. Below I have concocted a
sample string that fails, and variations of it that pass (with one
character removed or added).

Note,
(i) There is no problem inserting the string in question into
  sqlite3 directly (i.e. using sqlite3 directly).
(ii) Likewise there is no problem using d2sqlite3 with a prepared
  statement on the string (.execute). But for the use case, each
  document has several thousand content strings, and for each one
  sqlite3 then takes and releases a lock before the insertion of the
  next. This makes the operation, conservatively, several tens of
  times slower than generating a single sql statement that inserts
  all document rows (objects/paragraphs) and passing it to db.run
  wrapped in begin and commit. Basically, it makes a significant
  difference that this works.
(iii) There do not appear to be any offending utf-8 characters.

Assumption: the most likely candidate for blame is the d2sqlite3
wrapper (or my code; does something need to be escaped in certain
circumstances? please tell me). sqlite3 does not have a problem
with the string in question, and the C api for which d2sqlite3
provides a wrapper is much used and seems unlikely to be to blame.
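
On the escaping question: in an SQL string literal only the ASCII single quote
needs escaping (by doubling it); the typographic ’ in the sample text does not.
A minimal sketch, assuming the statement is built with format as in the test
code that follows. The helper name `sqlQuote` is hypothetical and not part of
d2sqlite3.

```d
import std.array : replace;
import std.format : format;

/// Hypothetical helper: wrap `text` as an SQL string literal,
/// doubling any ASCII single quote inside it.
string sqlQuote(string text)
{
    return "'" ~ text.replace("'", "''") ~ "'";
}

void main()
{
    // The typographic ’ (U+2019) and the ';' pass through unchanged.
    assert(sqlQuote("Peter’s cake; funny.") == "'Peter’s cake; funny.'");

    // The ASCII apostrophe is doubled.
    assert(sqlQuote("Peter's") == "'Peter''s'");

    // Building an INSERT with the quoted value.
    auto stmt = format("INSERT INTO test (txt) VALUES (%s);", sqlQuote("it's"));
    assert(stmt == "INSERT INTO test (txt) VALUES ('it''s');");
}
```

So escaping is not what goes wrong with the sample string, which contains no
ASCII single quotes.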

The exact location of problem may be provided in the error statement
"core.exception.UnicodeException@src/rt/util/utf.d(292): invalid
UTF-8 sequence".

From a layman's perspective it would appear that db.run, the passing
of an sql statement as a string to sqlite3, should be the simplest
type of transaction: pass a statement/string unchanged to sqlite3.

Sample offending text used:
"Contrary to Peter’s cake; all versions of Peter’s, custard. Eating without the cook’s permission is, of course, naughty. Now, a quick check of my menu shows that sure enough, a number of files have Peter’s recipe in them. John had better tell us if he ate Peter’s lunch. You didn’t eat Peter’s berries; funny.",

The prepared sql statement that fails with d2sqlite3 (but succeeds
using sqlite3 directly):

BEGIN;
DROP TABLE IF EXISTS test;
CREATE TABLE test (
  lid BIGINT PRIMARY KEY,
  txt TEXT NULL
);
INSERT INTO test (txt) VALUES ('Contrary to Peter’s cake; all versions of Peter’s, custard. Eating without the cook’s permission is, of course, naughty. Now, a quick check of my menu shows that sure enough, a number of files have Peter’s recipe in them. John had better tell us if he ate Peter’s lunch. You didn’t eat Peter’s berries; funny.');
COMMIT;

core.exception.UnicodeException@rt/util/utf.d(292): invalid UTF-8 sequence

in the sample problem text/string/statement for d2sqlite3 (above):
  - the utf-8 error goes away on removal of either:
    - one of the semi-colons (of which there are 2)
    - any of the closing single quote mark symbols ’ occurring between the
      two semi-colons (of which there are 6)
  - comments
    - placing a random semi-colon between the two offending semi-colons
      sometimes works
    - double semi-colon does not help
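
These observations are consistent with the explanation given in the follow-up
to this post: a code-point count is used as a code-unit offset, so every ’
before a semi-colon leaves the position two code units short. Whether that
position falls inside a multibyte sequence depends on how many ’ precede the
semi-colon. A minimal sketch using only Phobos; the helper names `onBoundary`
and `driftedEnd` are hypothetical, and the strings are shortened stand-ins for
the tail of the sample text.

```d
import std.algorithm : countUntil;

/// Hypothetical helper: true if `pos` is at the start of a UTF-8
/// sequence in `s` (or at/after the end of `s`).
bool onBoundary(string s, size_t pos)
{
    // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    return pos >= s.length || (s[pos] & 0xC0) != 0x80;
}

/// Hypothetical helper: the end position obtained when a code-point
/// count is used as if it were a code-unit offset.
size_t driftedEnd(string s)
{
    // countUntil decodes a plain string, so this counts code points.
    return s.countUntil(';') + 1;
}

void main()
{
    // Six ’ before the ';': 12 code units short, inside the last ’.
    string six = "’’’’’’s berries;";
    assert(six.length == 28);
    assert(driftedEnd(six) == 16);
    assert(!onBoundary(six, 16));

    // Five ’: 10 code units short, landing on the 's', a valid boundary.
    string five = "’’’’’s berries;";
    assert(driftedEnd(five) == 15);
    assert(onBoundary(five, 15));
}
```

A slice starting at a position that is not on a boundary begins with a
continuation byte, which is what the next decoding step rejects as invalid
UTF-8.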

Mock problem string with test code follows (d2sqlite3 required):

+/

module d2sqlite3_utf8.issue;
import std.conv : to;
import std.format;
import std.stdio;
import d2sqlite3;
void main() {
  string[] info_tag = ["pass", "fault"];
  foreach (t, tests; [ok_check, faults]) {
    foreach (i, insert_test; tests) {
      // auto db = Database("test.sqlite");
      auto db = Database(":memory:");
      string _sql_statement = format(q"¶
BEGIN;
DROP TABLE IF EXISTS test;
CREATE TABLE test (
  lid BIGINT PRIMARY KEY,
  txt TEXT NULL
);
INSERT INTO test (txt) VALUES ('%s');
COMMIT;
  ¶",
        insert_test[0],
      );
      writeln(
        (i + 1), ". ", info_tag[t], ": ", insert_test[1],
        "\n", _sql_statement);
      db.run(_sql_statement);
      foreach (parse; db.execute("SELECT txt FROM test;")) {
        foreach (s; parse) {
          writeln(
            "  (test string == sqlite database content): ",
            (insert_test[0] == s.to!string), "\n");
          assert(insert_test[0] == s.to!string);
          // writeln(s);
        }
      }
      assert(db.totalChanges == 1);
      db.close;
    }
  }
}
string[][] faults = [
  [
"Contrary to Peter’s cake; all versions of Peter’s, custard. Eating without 
the cook’s permission is, of course, naughty. Now, a quick check of my menu 
shows that sure enough, a number of files have Peter’s recipe in them. John had 
better tell us if he ate 

dmd debian installation conflicts with debian-goodies

2017-06-28 Thread Ralph Amissah via Digitalmars-d
Installing dmd fails if debian-goodies is installed: both packages try to write
a file named '/usr/bin/dman'.

Debian Stretch is out, the freeze is over, perhaps now dmd will soon be
available as a package in Debian? Ldc2 does a great job but for testing
purposes and convenience it would be good to have the reference compiler.
Is there likely to be D related activity at DebCamp and DebConf 2017, Montreal?

* failed attempt to install dmd with debian-goodies installed

sudo dpkg -i dmd_2.074.1-0_amd64.1.deb
(Reading database ... 224610 files and directories currently installed.)
Preparing to unpack dmd_2.074.1-0_amd64.1.deb ...
Unpacking dmd (2.074.1-0) ...
dpkg: error processing archive dmd_2.074.1-0_amd64.1.deb (--install):
 trying to overwrite '/usr/bin/dman', which is also in package debian-goodies 0.74
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)

sudo dpkg -i dmd_2.075.0~b1-0_amd64.deb
(Reading database ... 224610 files and directories currently installed.)
Preparing to unpack dmd_2.075.0~b1-0_amd64.deb ...
Unpacking dmd (2.075.0~b1-0) ...
dpkg: error processing archive dmd_2.075.0~b1-0_amd64.deb (--install):
 trying to overwrite '/usr/bin/dman', which is also in package debian-goodies 0.74
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)

* debian-goodies

information on debian-goodies (the package dmd conflicts with):

apt show debian-goodies
Package: debian-goodies
Version: 0.74
Priority: optional
Section: utils
Maintainer: Javier Fernández-Sanguino Peña 
Installed-Size: 199 kB
Depends: curl, dctrl-tools | grep-dctrl, perl, python3, whiptail | dialog
Recommends: lsof
Suggests: lsb-release, popularity-contest, xdg-utils, zenity
Conflicts: debget
Replaces: debget
Tag: implemented-in::python, interface::commandline, role::program,
 scope::utility, suite::debian, use::searching, works-with::bugs,
 works-with::software:package
Download-Size: 73.4 kB
APT-Manual-Installed: yes
APT-Sources: http://ftp.ch.debian.org/debian unstable/main amd64 Packages
Description: Small toolbox-style utilities for Debian systems
 These programs are designed to integrate with standard shell tools,
 extending them to operate on the Debian packaging system.
 .
  dgrep  - Search all files in specified packages for a regex
  dglob  - Generate a list of package names which match a pattern
 .
 These are also included, because they are useful and don't justify
 their own packages:
 .
  debget             - Fetch a .deb for a package in APT's database
  dpigs              - Show which installed packages occupy the most space
  debman             - Easily view man pages from a binary .deb without
                       extracting
  debmany            - Select manpages of installed or uninstalled packages
  dman               - Fetch manpages from online manpages.debian.org service
  checkrestart       - Help to find and restart processes which are using old
                       versions of upgraded files (such as libraries)
  popbugs            - Display a customized release-critical bug list based
                       on packages you use (using popularity-contest data)
  which-pkg-broke    - find which package might have broken another
  check-enhancements - find packages which enhance installed packages


Re: Makefile experts, unite!

2017-06-13 Thread Ralph Amissah via Digitalmars-d
On Sun, Jun 11 2017, Jonathan M Davis via Digitalmars-d wrote:

And where do existing D build systems fit into the picture? I seem to
recall this general discussion around the announcement of "button", which
seems interesting, potentially awesome even:

Button announcement
https://forum.dlang.org/post/uhozcvatvyztfuhiv...@forum.dlang.org

Features
https://code.dlang.org/packages/button
- Implicit dependency detection.
- Correct incremental builds.
- Can display a graph of the build.
- Recursive. Can generate a build description as part of the build.
- Very general. Does not make any assumptions about the structure of your
  project.
- Detects and displays cyclic dependencies.
- Detects race conditions.

example build description provided (the build description is done in
lua)
https://github.com/jasonwhite/dmd/blob/button/src/BUILD.lua

http://jasonwhite.github.io/button/

I have not used it, but if it works as described on the box, most should
be pleased to get to know it, use it and promote it, no? I assume it
will work well with dub if necessary? Other D-based build systems have
been mentioned; who is able to review them?

> On Sunday, June 11, 2017 16:47:30 Ali Çehreli via Digitalmars-d wrote:
>> On 06/11/2017 12:27 PM, ketmar wrote:
>>  > p.s.: or replacing make-based build system with D-based. as we need
>>  > working D compiler to compile dmd anyway, i see no reason to not use it
>>  > more.
>>
>> I had the pleasure of working with Eyal Lotem, main author of buildsome.
>> The buildsome team are aware of all pitfalls of all build systems and
>> offer build*some* as an awe*some* ;) and correct build system:
>>
>>http://buildsome.github.io/buildsome/
>
> Atila did some work to get dmd, druntime, phobos, etc. building with
> reggae/dub, and I _think_ that he had it all working. As I understand it,
> the main barrier to switching to it officially was political. A number of us
> would _love_ to see the makefiles killed off, and there's really no
> technical barrier to doing so. It's really a question of convincing folks
> like Walter, Andrei, and Martin, and I get the impression that to an extent,
> there's an attitude of not wanting to mess with what's working (though I
> dispute that it works all that well from a maintenance perspective).
>
> It's certainly a pain to edit the makefiles though, and I think that we'd be
> far better off in the long run if we switched to something like reggae - and
> since reggae is written in D and uses dub, we'd be dogfooding our own stuff
> in the process.
>
> - Jonathan M Davis
>
>

--
ralph.amis...@gmail.com