[issue47072] Database corruption with the shelve module

2022-03-27 Thread qiu


Change by qiu <1425166...@qq.com>:


--
components: +Demos and Tools -FreeBSD, Library (Lib)

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47072] Database corruption with the shelve module

2022-03-27 Thread Marc-Andre Lemburg


Marc-Andre Lemburg added the comment:

On 27.03.2022 09:56, Hubert Tournier wrote:
> 
> The storage format used under Windows is completely different from the one 
> used under Unix (or *BSD).

The shelve module uses the dbm module underneath and this will pick
its storage mechanism based on what's available on the platform:

https://docs.python.org/3/library/dbm.html
https://github.com/python/cpython/blob/3.10/Lib/dbm/__init__.py

It's likely that you'll get the dbm.dumb interface on Windows.
On Linux, you typically have either gdbm or Berkeley DB installed.

dbm.whichdb() will tell you which type of dbm implementation your
files are likely using.
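A minimal sketch of that check, assuming a throwaway shelf created in a temporary directory (the file name "whois_cache" is borrowed from this thread; the stored value is made up for illustration). The backend name it prints depends on the platform and on how Python was built:

```python
import dbm
import os
import shelve
import tempfile


def describe_shelf(path):
    """Return what dbm.whichdb() reports for the shelf at `path`.

    whichdb() returns a module name such as "dbm.gnu", "dbm.ndbm" or
    "dbm.dumb", an empty string if the format cannot be guessed, or
    None if the file does not exist or cannot be read.
    """
    return dbm.whichdb(path)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical shelf, created only to have something to inspect.
    path = os.path.join(tmp, "whois_cache")
    with shelve.open(path) as db:
        db["example"] = {"answer": 42}

    backend = describe_shelf(path)
    print("backend reported by dbm.whichdb():", backend)
```

Pass the same base name that was given to shelve.open(), without the .db, .dat or .dir suffix the backend may have appended.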

More on the differences between DBM-style libraries:
http://www.ccl.net/cca/software/UNIX/apache/apacheRH7.0/local-copies/dbm.html

Aside: You are probably better off using SQLite with a pickle
layer to store arbitrary objects. This is much more mature than
the dbm modules.
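One way such a layer could look, as a rough sketch only: the class name SqliteShelf, the table layout and the sample record are assumptions made for illustration, not an existing API. It keeps one row per key and stores the pickled value as a BLOB:

```python
import pickle
import sqlite3


class SqliteShelf:
    """Tiny dict-like store: string keys, pickled values, SQLite backend."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS shelf "
            "(key TEXT PRIMARY KEY, value BLOB NOT NULL)"
        )
        self.conn.commit()

    def __setitem__(self, key, value):
        blob = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
        # "with self.conn" commits on success and rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO shelf (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def __getitem__(self, key):
        row = self.conn.execute(
            "SELECT value FROM shelf WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return pickle.loads(row[0])

    def keys(self):
        cursor = self.conn.execute("SELECT key FROM shelf ORDER BY key")
        return [row[0] for row in cursor]

    def close(self):
        self.conn.close()


# Usage with an in-memory database and a made-up record.
store = SqliteShelf(":memory:")
store["185.220.102.6"] = {"org": "example", "lines": ["a", "b"]}
loaded = store["185.220.102.6"]
all_keys = store.keys()
store.close()
```

As with shelve itself, unpickling is only safe for data you wrote yourself.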

--
nosy: +lemburg




[issue47072] Database corruption with the shelve module

2022-03-27 Thread Hubert Tournier


Hubert Tournier added the comment:

The storage format used under Windows is completely different from the one used 
under Unix (or *BSD).

Apart from the .dat datafile, there is a .dir index file with CSV lines such as 
"'key', (offset, length)".

Whereas under Unix (or *BSD), I have:

# file whois_cache.db
whois_cache.db: Berkeley DB 1.85 (Hash, version 2, native byte-order)

I'll run a test on a Linux Raspberry Pi to see whether the issue is *BSD 
specific...

--




[issue47072] Database corruption with the shelve module

2022-03-27 Thread Hubert Tournier


Hubert Tournier added the comment:

Additional note: the test code WORKS under Windows 8.1 / Python 3.9.1 (though 
the data file is suffixed .dat instead of .db) resulting in a 4 MB database 
with 1065 records, some of them > 11 KB.

So maybe the bug is system dependent.

--
components: +FreeBSD
nosy: +koobs
versions: +Python 3.10




[issue47072] Database corruption with the shelve module

2022-03-26 Thread Hubert Tournier


Hubert Tournier added the comment:

I modified the test program to better reflect the size of the data structures 
stored in shelve (sys.getsizeof(), which I used before, was far off the real 
size).
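To illustrate why the two figures differ: sys.getsizeof() only counts the outermost container, while shelve stores the pickled value. A small sketch, with a made-up record standing in for the real whois data:

```python
import pickle
import sys

# Hypothetical record: one long string plus a list of shorter lines.
record = {
    "raw": "x" * 5000,
    "lines": [f"line {i}" for i in range(50)],
}

# Shallow size: the dict object only, not the strings and list inside it.
shallow_size = sys.getsizeof(record)

# Size of the bytes shelve would hand to the dbm backend for this value
# (shelve may use a different pickle protocol, so this is approximate).
pickled_size = len(pickle.dumps(record))

print("sys.getsizeof():", shallow_size)
print("pickled length: ", pickled_size)
```

The pickled length is the closer estimate of what ends up in the database file, though the backend adds its own per-record overhead.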

I saw that the database was corrupted by big records, though even bigger 
previous records had not corrupted it. Records larger than 1K (mentioned in one 
of the other problem reports) were routinely OK. Records larger than 4K (also 
mentioned in another problem report) were sometimes OK.

When I took a problematic record and used it alone or with a few other 
records, no corruption occurred.

Any ideas?

--
Added file: https://bugs.python.org/file50704/shelve-test-3.10-b.zip




[issue47072] Database corruption with the shelve module

2022-03-26 Thread Hubert Tournier


Hubert Tournier added the comment:

Hello,
Same results with Python 3.10.4:

[...]
Adding 185.220.102.6
Database has 62 records for 442368 bytes. Last record was 640 bytes long
Traceback (most recent call last):
  File "./shelve-test.py", line 84, in <module>
    _verify_whois_cache()
  File "./shelve-test.py", line 63, in _verify_whois_cache
    for key in db.keys():
  File "/usr/local/lib/python3.10/_collections_abc.py", line 881, in __iter__
    yield from self._mapping
  File "/usr/local/lib/python3.10/shelve.py", line 95, in __iter__
    for k in self.dict.keys():
SystemError: Negative size passed to PyBytes_FromStringAndSize
# freebsd-version -uk
13.0-RELEASE-p8
13.0-RELEASE-p10
# python3.10 --version
Python 3.10.4

The point at which the database breaks varies (from 50 to 500+ records); the 
size of the database doesn't seem to be relevant (from 400K to 1800K).

The size of the record doesn't appear to be relevant either (though I'm not 
100% sure I'm measuring the right figure). That said, my other uses of the 
shelve module, with many more records but much smaller and less complex ones, 
have had no issues.

--
Added file: https://bugs.python.org/file50703/shelve-test-3.10.zip




[issue47072] Database corruption with the shelve module

2022-03-25 Thread Terry J. Reedy


Terry J. Reedy added the comment:

3.8 only gets security patches.  If you can, please test with a newer version.

--
nosy: +terry.reedy




[issue47072] Database corruption with the shelve module

2022-03-20 Thread Hubert Tournier


New submission from Hubert Tournier:

After adding a few records, the shelve module corrupts the database keys (the 
database is still readable if an element key is known, but is no longer 
iterable):

Traceback (most recent call last):
  File "./shelve-test.py", line 81, in <module>
    _verify_whois_cache()
  File "./shelve-test.py", line 61, in _verify_whois_cache
    for key in db.keys():
  File "/usr/local/lib/python3.8/_collections_abc.py", line 720, in __iter__
    yield from self._mapping
  File "/usr/local/lib/python3.8/shelve.py", line 95, in __iter__
    for k in self.dict.keys():
SystemError: Negative size passed to PyBytes_FromStringAndSize

I provide a short test program and data that systematically reproduce the bug. 
I also added a script showing the execution messages, plus the resulting 
database in DB and text formats.

Tested with Python 3.8.12 on FreeBSD 13.0-RELEASE-p8.
I suppose Python is using my system package db5-5.3.28_8 
(Oracle Berkeley DB, revision 5.3).

See also similar issues:
https://bugs.python.org/issue33074
https://bugs.python.org/issue30388

--
components: Library (Lib)
files: shelve-test.zip
messages: 415625
nosy: HubTou
priority: normal
severity: normal
status: open
title: Database corruption with the shelve module
type: behavior
versions: Python 3.8
Added file: https://bugs.python.org/file50693/shelve-test.zip
