On Thu, 30 Mar 2017, Ruediger Meier wrote:

On Wednesday 29 March 2017, Andi Vajda wrote:
On Wed, 29 Mar 2017, Petrus Hyvönen wrote:
Hi,

With the /DLL flag, the sprintf(buffer, "%0*%jx", (int) hexdig, hash);
change, and Py_SIZE, it compiles under Windows (Windows 7, 64-bit).

I haven't set up a PyLucene build, but I have another library that
I build.

For that I get a UTF-8 error at:

  File "C:\Users\phy\AppData\Local\Continuum\Anaconda3-430\conda-bld\orekit_1490824040916\_b_env\lib\site-packages\jcc\cpp.py", line 898, in header
    env.strhash(signature(constructor)))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 9: invalid start byte

That means that the "%0*%jx" change for _MSC_VER also needs a change
for the hexdig sizeof computation. There is a mismatch and garbage is
left in the buffer array.
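
One way such garbage can arise is sketched below. This is a minimal
illustration and not JCC code: the helper names are hypothetical, the
0xac fill is only a stand-in for uninitialized stack contents, and the
unpadded "%llx" is just one assumed way for sprintf to write fewer than
hexdig characters while the caller still returns hexdig bytes.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: returns exactly hexdig bytes of the buffer, the way
// PyUnicode_FromStringAndSize(buffer, hexdig) would consume them.
std::string hex_without_padding(unsigned long long hash)
{
    const std::size_t hexdig = sizeof(hash) * 2;
    char buffer[sizeof(hash) * 2 + 1];

    // Stand-in for uninitialized stack contents.
    std::memset(buffer, 0xac, sizeof(buffer));

    // No zero padding: small values write fewer than hexdig characters,
    // so the tail of the buffer keeps its previous contents.
    std::snprintf(buffer, sizeof(buffer), "%llx", hash);
    return std::string(buffer, hexdig);
}

// Same helper with the width passed via "%0*", so the number of characters
// written always matches hexdig.
std::string hex_with_padding(unsigned long long hash)
{
    const std::size_t hexdig = sizeof(hash) * 2;
    char buffer[sizeof(hash) * 2 + 1];

    std::memset(buffer, 0xac, sizeof(buffer));
    int written = std::snprintf(buffer, sizeof(buffer), "%0*llx",
                                (int) hexdig, hash);
    assert(written == (int) hexdig);
    (void) written;  // silence unused-variable warnings under NDEBUG
    return std::string(buffer, hexdig);
}
```

With the unpadded variant, the bytes after the terminating NUL are not
valid UTF-8, which is the kind of input that would make a decoder fail
with "invalid start byte".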


Hm, actually "%jx" and PRIxMAX should be equivalent modifiers to print
uintmax_t (C99). According to the MSC documentation, I thought that the
macro was even safer to use.

The problem is that this is C++, which does not need to support C99 at
all. Though MSC might be the only C++ compiler that ignores C99...

To fix it, I would suggest using non-C99 types here. Let's use "unsigned
long long" and "%llx". This may look a bit less rock-solid, but it is
good enough for any existing system:

I'm happy to hide this _MSC_VER hackery in conditional code.

Andi..


static PyObject *t_jccenv_strhash(PyObject *self, PyObject *arg)
{
   unsigned long long hash = (unsigned long long) PyObject_Hash(arg);
   static const size_t hexdig = sizeof(hash) * 2;
   char buffer[hexdig + 1];

   sprintf(buffer, "%0*llx", (int) hexdig, hash);
   return PyUnicode_FromStringAndSize(buffer, hexdig);
}

BTW, this function should also be copied to the py2 directory, where we
still use int although PyObject_Hash already returns long on Python
2.x.
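
To illustrate why narrowing the hash to int matters, here is a small
sketch. It is not the py2 code: the function names are hypothetical,
long long stands in for a 64-bit hash value, and unsigned int is assumed
to be 32 bits wide. Two hashes that differ only in their upper bits
produce the same string once narrowed.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in: formats the hash after narrowing it to
// unsigned int (assumed 32 bits), which drops the upper bits.
std::string hex_from_int(long long hash)
{
    unsigned int narrowed = (unsigned int) hash;
    const int hexdig = (int) (sizeof(narrowed) * 2);
    char buffer[sizeof(narrowed) * 2 + 1];

    int written = std::snprintf(buffer, sizeof(buffer), "%0*x",
                                hexdig, narrowed);
    assert(written == hexdig);
    (void) written;  // silence unused-variable warnings under NDEBUG
    return std::string(buffer, (std::size_t) hexdig);
}

// Hypothetical stand-in: formats the full value as unsigned long long.
std::string hex_from_full(long long hash)
{
    unsigned long long wide = (unsigned long long) hash;
    const int hexdig = (int) (sizeof(wide) * 2);
    char buffer[sizeof(wide) * 2 + 1];

    int written = std::snprintf(buffer, sizeof(buffer), "%0*llx",
                                hexdig, wide);
    assert(written == hexdig);
    (void) written;
    return std::string(buffer, (std::size_t) hexdig);
}
```

For example, 0x100000001 and 0x1 map to the same string with
hex_from_int but to different strings with hex_from_full.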

cu,
Rudi

