Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-25 Thread coldtobi
Package: libtokyocabinet9
Followup-For: Bug #667979

Tagging this wontfix / help-needed, as I currently see no feasible way to
resolve this other than heavily patching tokyocabinet.

Of course, I'm open to other approaches which we might have missed...

coldtobi

-- System Information:
Debian Release: wheezy/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.utf8, LC_CTYPE=de_DE.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages libtokyocabinet9 depends on:
ii  libbz2-1.0  1.0.6-1
ii  libc6   2.13-30
ii  zlib1g  1:1.2.6.dfsg-2

libtokyocabinet9 recommends no packages.

libtokyocabinet9 suggests no packages.

-- no debconf information




-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-11 Thread Tobias Frost
Well...
I'm still waiting for the answer from upstream, but if you are right
(and I fear you are) changing the behaviour would break other
applications, and I do not believe that this would be wise.

More or less, we would need a migration path, like patching tokyocabinet to
automatically detect endianness and act accordingly (whatever that would be;
for sure this is not a trivial thing).
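
One conceivable detection heuristic is sketched below in Python. It is only a sketch: it assumes that the 8-byte field at offset 0x38 of a hash database holds the file size, which is what the hexdumps in the original report suggest (the value there equals the final hexdump offset) but which has not been checked against the Tokyo Cabinet source. The names `SIZE_FIELD_OFFSET` and `guess_byte_order` are made up for this illustration.

```python
import os
import struct
from typing import Optional

SIZE_FIELD_OFFSET = 0x38  # ASSUMED position of the file-size field (see text above)


def guess_byte_order(path: str) -> Optional[str]:
    """Return "little" or "big" if exactly one reading of the assumed
    file-size field matches the real file size, otherwise None."""
    actual_size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(SIZE_FIELD_OFFSET)
        raw = f.read(8)
    if len(raw) != 8:
        return None  # file too short to contain the field
    as_little = struct.unpack("<Q", raw)[0]
    as_big = struct.unpack(">Q", raw)[0]
    if as_little == actual_size and as_big != actual_size:
        return "little"
    if as_big == actual_size and as_little != actual_size:
        return "big"
    return None  # no match or ambiguous: do not guess
```

A real migration path would additionally have to decide what to do once the order is known (convert in place, or read both formats), which is the non-trivial part.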

So let's wait for an answer from upstream before looking into detailed
options, but currently I cannot rule out tagging this wontfix or at
least help-needed.

Tobias



 

On Monday, 09.04.2012, at 22:09 +0200, Mikhail Gusarov wrote:
> Hi.
>
> Upstream Tokyo Cabinet uses little-endian data in databases and hence
> the databases are portable.
>
> After disabling the switch, the database format will not change on
> big-endian machines, but it *will* on little-endian ones, so existing
> databases on little-endian machines will become unreadable and opening
> them will fail with an invalid metadata error.
>
> BTW, I stand corrected in the original bug report: on big-endian
> machines the database format is correct.
 







Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-11 Thread Mikhail Gusarov
Tobias,

Twas brillig at 08:28:53 11.04.2012 UTC+02 when t...@frost.de did gyre and 
gimble:

 TF> I'm still waiting for the answer from upstream, but if you are
 TF> right (and I fear you are) changing the behaviour would break other
 TF> applications, and I do not believe that this would be wise.

At least mutt re-creates its TC database on demand; that's what I checked.

 TF> More or less, we would need a migration path, like patching tokyocabinet to
 TF> automatically detect endianness and act accordingly (whatever that would be;
 TF> for sure this is not a trivial thing).

Not a trivial change, indeed.

 TF> So let's wait for an answer from upstream before looking into detailed
 TF> options, but currently I cannot rule out tagging this wontfix or at
 TF> least help-needed.

Upstream does not care about TokyoCabinet anymore, it's KyotoCabinet
nowadays. Sad, but true.



Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-09 Thread Mikhail Gusarov
Hi.

I would revert to upstream's default and use native endianness.

Having endian-neutral databases would be nice, but given that it is not an
upstream goal anyway, I'd stick with upstream and avoid introducing
incompatibilities.



Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-09 Thread Tobias Frost
Package: libtokyocabinet9
Followup-For: Bug #667979

Update after some research on this topic. In particular, I wondered whether
changing the endianness would break existing database files, but this seems not
to be the case, as http://fallabs.com/tokyocabinet/spex-en.html says:

<quote>
Because database files are not sparse, you can copy them as with normal files. 
Moreover, the database formats don't depend on the byte order of the running 
environment, you can migrate the database files between environments with 
different byte orders.
</quote>

So:
- Portability of the database files is not affected by either --enable-swab or
  --disable-swab.
- It should be safe just to switch the flag to resolve this issue.

coldtobi






Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-09 Thread Mikhail Gusarov
Hi.

Upstream Tokyo Cabinet uses little-endian data in databases and hence
the databases are portable.

After disabling the switch, the database format will not change on
big-endian machines, but it *will* on little-endian ones, so existing
databases on little-endian machines will become unreadable and opening
them will fail with an invalid metadata error.

BTW, I stand corrected in the original bug report: on big-endian
machines the database format is correct.



Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-08 Thread Tobias Frost
Package: libtokyocabinet9
Followup-For: Bug #667979

Hello Mikhail,

Thanks for reporting this issue. Always enabling swab of course does not seem to
make sense; however, I cannot tell what the then-maintainer's reasons were.

Basically we have three options here:
1) Go for little-endian for all archs
2) Go for big-endian for all archs
3) Use the native endianness of the architecture

If we go for a dedicated endianness, I think it should be little-endian, because
the major archs run with this endianness. So I would rule out 2).

However, 1) sacrifices those users who are not interested in a portable
database but run on a big-endian machine. Serving them would be option 3), which
in turn sacrifices the portable db feature. (Honestly, I do not know if this
feature is widely used.) Therefore I'm slightly in favour of 3).

Let me know your opinion.

coldtobi






Bug#667979: libtokyocabinet9: TokyoCabinet got endianness in DB wrong on both big- and little-endian architectures

2012-04-07 Thread Mikhail Gusarov
Package: libtokyocabinet9
Version: 1.4.47-1
Severity: important

TokyoCabinet under Debian unconditionally uses --enable-swab option, which has
the following two effects:

- It forces tcucodec conf to say that the machine is big-endian (wrong).
- It enforces the *non-native* data endianness in DB format.

As a result, under Debian on little-endian machines Tokyo Cabinet uses
big-endian format for the database, and on big-endian machines Tokyo Cabinet
uses little-endian format for the database, which

- is incompatible with any other build of Tokyo Cabinet on any other operating
  system/Linux distribution for the same endianness;

- does not achieve database portability, which is declared as the goal of
  --enable-swab in debian/changelog;

- is unnecessarily slow on all architectures.
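
The behaviour described above can be captured in a toy model. The Python sketch below is not taken from the Tokyo Cabinet source; it assumes that the library byte-swaps integers whenever it believes the host is big-endian, and that --enable-swab makes it believe so unconditionally. The function name `on_disk_order` is made up for this illustration.

```python
def on_disk_order(host_order: str, force_swab: bool) -> str:
    """Byte order written to the file under the model described above.

    ASSUMPTION: bytes are swapped whenever the library believes the host is
    big-endian, and --enable-swab forces that belief on every host.
    """
    believes_big_endian = force_swab or host_order == "big"
    if not believes_big_endian:
        return host_order  # no swap: native order is written
    # swap: the written order is the opposite of the host's native order
    return "little" if host_order == "big" else "big"


for host in ("little", "big"):
    for swab in (False, True):
        order = on_disk_order(host, swab)
        print(f"host={host:<6} enable-swab={swab!s:<5} -> on-disk {order}")
```

Under this model only little-endian hosts change their on-disk format when the flag is toggled.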

Exhibit 1 (little-endian machine):

$ uname -a && tchmgr create foobar.tcdb && hexdump -C foobar.tcdb
Linux leibnitz 3.3.0-rc6-amd64 #1 SMP Mon Mar 5 20:53:11 UTC 2012 x86_64 GNU/Linux
00000000  54 6f 4b 79 4f 20 43 61  42 69 4e 65 54 0a 31 2e  |ToKyO CaBiNeT.1.|
00000010  30 3a 39 31 30 0a 00 00  00 00 00 00 00 00 00 00  |0:910...........|
00000020  00 00 04 0a 00 00 00 00  00 00 00 00 00 01 ff ff  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 08 11 40  |...............@|
00000040  00 00 00 00 00 08 11 40  00 00 00 00 00 00 00 00  |.......@........|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00081140
$

See offsets 0x28-0x2f and 0x38-0x3f: the values are stored big-endian.

Exhibit 2 (big-endian machine):

$ uname -a && tchmgr create foobar.tchdb && hexdump -C foobar.tcdb
Linux smetana 2.6.32-5-sparc64-smp #1 SMP Thu Mar 22 18:46:12 UTC 2012 sparc64 GNU/Linux
00000000  54 6f 4b 79 4f 20 43 61  42 69 4e 65 54 0a 31 2e  |ToKyO CaBiNeT.1.|
00000010  30 3a 39 31 30 0a 00 00  00 00 00 00 00 00 00 00  |0:910...........|
00000020  00 00 04 0a 00 00 00 00  ff ff 01 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  40 11 08 00 00 00 00 00  |........@.......|
00000040  40 11 08 00 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00081140
$

Same offsets, little-endian encoding.
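
The two exhibits can be cross-checked by decoding the bytes at 0x28-0x2f and 0x38-0x3f in both byte orders. The Python sketch below copies the byte strings from the hexdumps above; what the fields mean is not stated in this report, so the variable names are only descriptive labels.

```python
import struct

# Bytes at offsets 0x28-0x2f, copied from the two hexdumps.
exhibit1_0x28 = bytes.fromhex("00 00 00 00 00 01 ff ff")  # amd64 host
exhibit2_0x28 = bytes.fromhex("ff ff 01 00 00 00 00 00")  # sparc64 host

# Bytes at offsets 0x38-0x3f, copied from the two hexdumps.
exhibit1_0x38 = bytes.fromhex("00 00 00 00 00 08 11 40")
exhibit2_0x38 = bytes.fromhex("40 11 08 00 00 00 00 00")


def decode(raw: bytes) -> dict:
    """Interpret 8 bytes as an unsigned 64-bit integer in both byte orders."""
    return {
        "little": struct.unpack("<Q", raw)[0],
        "big": struct.unpack(">Q", raw)[0],
    }


# The amd64 file only yields small, plausible values when read big-endian,
# the sparc64 file only when read little-endian.
print(hex(decode(exhibit1_0x28)["big"]), hex(decode(exhibit2_0x28)["little"]))
print(hex(decode(exhibit1_0x38)["big"]), hex(decode(exhibit2_0x38)["little"]))
```

The value decoded from 0x38-0x3f matches the final offset printed by hexdump (0x81140) in both exhibits, so each file is internally consistent and the two differ only in byte order.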

Due to the really unfortunate consequences of --enable-swab, priority is set to
Important.

-- System Information:
Debian Release: wheezy/sid
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'testing'), (50, 'experimental'), (50, 'unstable'), (50, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.3.0-rc6-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=ru_RU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages libtokyocabinet9 depends on:
ii  libbz2-1.0  1.0.6-1
ii  libc6   2.13-27
ii  zlib1g  1:1.2.6.dfsg-2

libtokyocabinet9 recommends no packages.

libtokyocabinet9 suggests no packages.

-- no debconf information


