Re: Thread handle leak in APR
On Fri, Aug 20, 2010 at 8:30 PM, William A. Rowe Jr. wr...@rowe-clan.net wrote: On 8/20/2010 1:08 PM, Guenter Knauf wrote: Hi Prathima, Am 20.08.2010 09:04, schrieb Prathima Dandapani -X (pdandapa - HCL at Cisco): Steps used to compile Apache is as follows. Run Setenv.cmd from the Windows-Server-2003-R2-Platform-SDK installed directory Run vcvars32.bat from VC++ installed directory. AFAICT this is the wrong order; first you need to call vcvars32.bat, then Setenv.cmd so that the PSDK paths are prepended. Precisely :) Will reversing the order make his leak disappear? Regards, Erik.
Re: poll/select process's IO on windows
Hi Jack, Regardless of the portability library that you're using, Windows doesn't allow select() operations on any other than socket handles. I don't have personal experience how to work around that issue with APR though, I'm sorry. With kind regards, Erik. On Tue, Jun 16, 2009 at 8:42 PM, Jack Andrewseffb...@gmail.com wrote: hi, i want to detect IO on a child process stdio as well as a socket from elsewhere. ie. i want a select() on /windows/. is it possible with APR? or do i have to hack around? my first attempt is here: int main(int c,char** argv) { apr_procattr_t *attr; const char*args[]={cmd.exe,0}; apr_sockaddr_t *sa; apr_socket_t *sock; apr_file_t *fds[4]; apr_pollfd_t pfd[4],*pfdout; apr_pollset_t *pollset; fds[0]=(apr_file_t*)sock; AS(apr_procattr_create(attr, pool)); AS(apr_procattr_io_set(attr, APR_FULL_BLOCK,APR_FULL_BLOCK,APR_FULL_BLOCK)); AS(apr_procattr_cmdtype_set(attr, APR_PROGRAM)); AS(apr_proc_create(newproc, c:\\Windows\\system32\\cmd.exe, args, 0, attr, pool)); fds[1]=newproc.in;fds[2]=newproc.out;fds[3]=newproc.err; AS(apr_pollset_create(pollset, 4, pool, APR_POLLSET_WAKEABLE)); #define setpfd(x,a,b,c,d) pfd[x].desc_type=(a),pfd[x].reqevents=(b),pfd[x].desc.s=(c),pfd[x].client_data=(d) for(i=0;i4;i++) { setpfd(i,i?APR_POLL_FILE:APR_POLL_SOCKET,APR_POLLIN|APR_POLLOUT|APR_POLLHUP,(apr_socket_t*)fds[i],NULL); AS(apr_pollset_add(pollset,pfd+i)); } for(;APR_TIMEUP==apr_pollset_poll(pollset,0,n,pfdout);) { printf(hoo\n); } return 0; } ta, jack.
Re: apr pools memory leaks
On Wed, Oct 1, 2008 at 8:31 PM, Mark Phippard [EMAIL PROTECTED] wrote: On Wed, Oct 1, 2008 at 2:11 PM, Ben Collins-Sussman [EMAIL PROTECTED] wrote: I have interesting memory leak data to share with these two lists (crossposting to both svn and apr dev lists). Ever since we launched svn-on-bigtable over at Google (about 2 years ago), we've been struggling with mysterious memory leaks in apache -- very similar to what users are complaining about in Subversion issue 3084. After lots of analysis, here's what we've figured out so far. It is good to see some analysis on this issue. Here is link BTW: http://subversion.tigris.org/issues/show_bug.cgi?id=3084 A couple questions: 1) This seems to happen only with Apache 2.2 and not 2.0. Is there any explanation for that supported by your analysis? 2) It seems like many of the people, at least on Windows, can reproduce this problem quickly. Could this just be due to running requests which create/destroy a lot of memory? 3) Any reason more Windows users would see this than Linux? Maybe more Windows SVN users use Apache 2.2 than on Linux? Windows doesn't support prefork mode; only threaded operation. On Linux/Unix the default mode of operation of Apache is some sort of creation of disposable processes. The threaded operation in Windows doesn't have that (a disposable process which cleans up any memory management issues). Bye, Erik.
Re: Opaque structures in general (was Re: Opaque apr_pool_t structure)
On 6/6/08, Ryan Bloom [EMAIL PROTECTED] wrote: I don't think that there is any reason to not have a sizeof() function, other than any code that does play with the pointers will be non-portable code. The reason that I originally went with opaque data structures (I did it before giving the code to the ASF), was that most of the structures are defined totally differently on each platform. By making the structures opaque, it became much harder for a developer to write code with APR that worked on some APR platforms, but not others. If you play with the pointers, your code is very likely to work only on the platforms that you code on. But, I would like to hear from some of the active developers about this as well. Well, as soon as you provide its size, it's not completely opaque anymore, now is it :-) I think the entire issue is centered around the fact that Yann doesn't really want to play by the pool-rules... I've been subscribed to this list a few years now. I've heard people regretting that APR uses pools for all of its portability functionality. I've seen the Subversion devs juggle with pool life time problems, but I've never heard anybody wanting to copy pools. Having done quite my share of pool life time juggling myself, I really don't understand the desire to copy pools around: you have the pointer, you don't know what else has a pointer... Bye, Erik.
Request for inclusion in 1.2.13/.14 or 1.3.0: Win32 file copy efficiency
Bug #44193 includes a (one line!) patch and describes a situation where network apr_file_copy() results in very bad performance for network mounted filesystems. The issue is that APR uses a 512 bytes large buffer (BUFSIZ) to read and write data in the copy operation. Performance would benefit greatly from using APR_FILE_BUFSIZE, which is defined as 4096 bytes. Could someone have a look before shipping new releases from the 1.2, 1.3 or even 0.9 branches? Thanks in advance! Bye, Erik.
Re: [patch] creating directories with long names
On Tue, Feb 5, 2008 at 8:57 PM, William A. Rowe, Jr. [EMAIL PROTECTED] wrote: Stefan Küng wrote: Anyone looked at this yet? http://mail-archives.apache.org/mod_mbox/apr-dev/200801.mbox/[EMAIL PROTECTED] The patch/argument made perfect sense, although I'm leaning towards a switch to decide if it's a file or path name (if it is even possible for that to work). I'll review and commit. We're seeing this problem more and more often popping up on [EMAIL PROTECTED] Any progress on the issue? Thanks in advance. Bye, Erik.
misc/win32/rand.c needs to include apr.h *before* windows.h
I thought I'd forward the APR bugreport in the PS in the message below to the APR list (as the Subversion list won't be able to address the issue). Bye, Erik. -- Forwarded message -- From: Stefan Küng [EMAIL PROTECTED] Date: Nov 2, 2007 4:53 PM Subject: [PATCH] fix build with IPv6 support on Windows To: [EMAIL PROTECTED] Hi, On Windows, building Subversion with IPv6 support enabled currently fails because two windows specific files don't include winsock2.h before windows.h. The attached patch fixes this. Stefan P.S. There's also one change needed to the apr libraries (the file misc/win32/rand.c needs to include apr.h *before* windows.h) to get the build running. But once that include line is moved two lines up, it works.
Re: [1.2.x Win32] apr.dsp references non-existing buffer.c?
On 10/8/07, William A. Rowe, Jr. [EMAIL PROTECTED] wrote: Erik Huelsmann wrote: While (for the first time in a looong time) trying to build APR, I built the 1.2.x branch on Win32 tonight. I'm however getting an error from the DSP file: it expects a file file_io/win32/buffer.c to exist, yet it doesn't... Removing that file from the solution and rebuilding fixes the problem. There's no such reference on 1.2.x branch, sure you had svn up'ped and svn revert'ed any local changes? Yup. But to be sure, I reverted again (file list turned up empty), restarted VS and reran 'build apr'. The output is this: 1-- Build started: Project: apr, Configuration: Debug Win32 -- 1Compiling... 1buffer.c 1c1 : fatal error C1083: Cannot open source file: '.\file_io\win32\buffer.c': No such file or directory 1Build log was saved at file://c:\Users\Erik\Documents\apr-1.2.x\LibD\BuildLog.htm 1apr - 1 error(s), 0 warning(s) == Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped == Hope you can help! bye, Erik.
Re: [1.2.x Win32] apr.dsp references non-existing buffer.c?
On 10/8/07, William A. Rowe, Jr. [EMAIL PROTECTED] wrote: Erik Huelsmann wrote: 1Build log was saved at file://c:\Users\Erik\Documents\apr-1.2.x\LibD\BuildLog.htm Hope you can help! svn status apr.dsp svn info apr.dsp Turned out there were some unversioned files left behind after all. Sorry, but thanks very much for your time! bye, Erik.
Re: [vote] apr 1.2.x win32 compatibility
On 10/5/07, William A. Rowe, Jr. [EMAIL PROTECTED] wrote: Paul Querna wrote: Revert. New Macros are new APIs, and must wait for 1.3.x. Ok, as people are totally missing the point, that I was calling a vote for a modest versioning rules change that was in the spirit of both the versioning rules and of the portability principal... I've reverted, vote withdrawn. End of 1.2 behavioral corrections, as least from me. Eric, w.r.t. to your patch to httpd, keep it, as behavioral changes to apr 1.2 appear to be out of the question. PLEASE consider committing or submitting your patch to 1.3 to make default timeout behavior the same for Win32 as for Unix. Doesn't this suggest APR needs a 1.3 release soon? I mean, if you can't fix it in a stable release line, you can always resort to doing a minor release to fix the problems, right? bye, Erik.
Re: EISDIR / EPERM not categorised in APR_STATUS_IS_* macros
On 9/13/07, Joe Orton [EMAIL PROTECTED] wrote: On Tue, Sep 11, 2007 at 11:25:20PM +0200, Erik Huelsmann wrote: While working on changing some of the working copy library in Subversion, I'm trying to eliminate stat() calls (apr_stat()) in order to speed up overall operation. Very often, Subversion libsvn_wc uses this pattern: svn_io_check_path() - retrieves a path kind using apr_stat() if (kind == directory) do_dir_thing(); else do_file_thing(); Where dir and file thing could be just removal. The common case in the routines I'm looking at happens to be file removal. So, to reduce the number of stats, I want to replace the stat()+unlink() with a single unlink(), checking the result for certain types of failure. Is this when dealing with files discovered by use apr_dir_*()? (i.e. why isn't the filetype hint used) Not in all cases: in most cases it loops over the list of versioned items ith the current directory (which are stored in a file). That list holds the type of the entry that *should* be on disk, but, ofcourse, that's not a guarantee we'll actually find a file on disk if the versioned item is a file. Reading the man 2 unlink page, EPERM is returned for POSIXy unices when deleting a directory, whereas EISDIR is returned on Linux. (MSDN isn't quite clear about Win32 behaviour...) Neither EPERM nor EISDIR have been categorised in any of the APR_STATUS_IS_* macros. I searched the mailing list archives at MARC, but found only an old mail by Branko which suggests this part of the error codes is unfinished business. Now, before I start working up any patches: Is that true, or is this missing with intent? Hmm, what exactly are you looking for here? An APR_STATUS_* code which means exactly this apr_status_t value if returned by apr_file_remove() means the file is a directory? Or just an APR_STATUS_IS_ESDIR() wrapper which is 0 #ifndef EISDIR? The former (ie the APR_STATUS_* code/macro to check the return value of apr_file_remove()). bye, Erik.
EISDIR / EPERM not categorised in APR_STATUS_IS_* macros
While working on changing some of the working copy library in Subversion, I'm trying to eliminate stat() calls (apr_stat()) in order to speed up overall operation. Very often, Subversion libsvn_wc uses this pattern: svn_io_check_path() - retrieves a path kind using apr_stat() if (kind == directory) do_dir_thing(); else do_file_thing(); Where dir and file thing could be just removal. The common case in the routines I'm looking at happens to be file removal. So, to reduce the number of stats, I want to replace the stat()+unlink() with a single unlink(), checking the result for certain types of failure. Reading the man 2 unlink page, EPERM is returned for POSIXy unices when deleting a directory, whereas EISDIR is returned on Linux. (MSDN isn't quite clear about Win32 behaviour...) Neither EPERM nor EISDIR have been categorised in any of the APR_STATUS_IS_* macros. I searched the mailing list archives at MARC, but found only an old mail by Branko which suggests this part of the error codes is unfinished business. Now, before I start working up any patches: Is that true, or is this missing with intent? Thanks! bye, Erik.
Re: question about APR_IS_DEV_VERSION
On 8/8/07, Guenter Knauf [EMAIL PROTECTED] wrote: Hi all, regardless at which APR version I look at - everywhere I find: /** * This symbol is defined for internal, development copies of APR. This * symbol will be #undef'd for releases. */ #undef APR_IS_DEV_VERSION so it seems to me that this flag is currently not used, and: /** Internal: string form of the is dev flag */ #ifdef APR_IS_DEV_VERSION #define APR_IS_DEV_STRING -dev #else #define APR_IS_DEV_STRING #endif does never apply...; what's the reason that we dont use this with APR? Have you looked in APR trunk? (http://svn.apache.org/viewvc/apr/apr/trunk/include/apr_version.h?revision=428317view=markup) bye, Erik.
Re: apr_filepath_encoding on Darwin
- if Darwin has a configurable locale, does *not* set this up by default such that nl_langinfo(CODESET) returns UTF-8, but does by policy require filenames in UTF-8, regardless of locale, I would agree with changing apr_filepath_encoding as Erik proposed. That is the case? I don't know what the BSD locale system (nl_langinfo , whatever) does in Darwin; I've never worked with it. I only know that for file names, we tell developers to use UTF-8. -wsv I'd like to conclude this thread then by supplying a patch which applies equally to trunk and 0.9.x with patch -p1, to return UTF8 paths on Darwin. Note that I wasn't able to test the change myself, as I don't have access to Darwin, but the code change is trivial. (I just wonder if I got the #if-s right.) Bye, Erik. (I attached the patch because Google often munges them, but if you want, I can supply it inline too.) Index: trunk/file_io/unix/filepath.c === --- trunk/file_io/unix/filepath.c (revision 562753) +++ trunk/file_io/unix/filepath.c (working copy) @@ -305,6 +305,10 @@ APR_DECLARE(apr_status_t) apr_filepath_encoding(int *style, apr_pool_t *p) { +#if defined(DARWIN) +*style = APR_FILEPATH_ENCODING_UTF8 +#else *style = APR_FILEPATH_ENCODING_LOCALE; +#endif return APR_SUCCESS; }
Re: SVN CLI and Windows security
On 7/25/07, Brad Stiles [EMAIL PROTECTED] wrote: I ran into a situation today concerning the command line client and windows security. The problem is that the user (a build user performing automated builds on a Windows 2003 VM) attempted to check out a file to a network share and couldn't, apparently because that user doesn't have access to the share root, even though it does have access to the directory into which the file should be exported. The command I used was: svn export http://server/trunk/dir1/dir2/file.txt \\server\share\dir1\dir2\dir3\file.txt --non-interactive --force To which svn.exe responded: svn: Error resolving case of '\\server\share\dir1\dir2\dir3\file.txt' In that tree, the user has no rights to \\server\share\, but has read/write access to dir1 and below. If, as that user, I do 'dir \\server\share', I see nothing, but if I do 'dir \\server\share\dir1', I see everything just fine. Now, in our case, that shouldn't be a problem much longer because the network guys are in the process of changing the access rights on that directory; the build user really should have access there. However, I don't think it's unreasonable to assume that there might be a legitimate reason for such a restriction. Is there something we did wrong here (aside from the rights issue) that would have allowed the export to happen despite the access rights? FWIW, 'svn checkout' resulted in the same type of error. When the network guy saw it, he said, Maybe they're using the old NT4 method of stepping through the tree, rather than going straight to the location specified. Is there something in whatever layer svn is using for this that accesses the tree step by step rather than as one entity? And is it worth fixing if it is a problem? Well, it's an error that finds its roots in a call to apr_filepath_merge with an argument of APR_FILEPATH_TRUENAME. Looking at that function, it calls apr_filepath_root, which tries to isolate the root in order to see whether it actually exists. That scheme breaks down when there are no read/write permissions. I have no idea whether the APR project feels this is something worth to be fixed, or even fixeable. I cc'ed their dev@ list, to make them aware if the weren't already. Hope that answers your question. bye, Erik.
Re: SVN CLI and Windows security
On 7/29/07, William A. Rowe, Jr. [EMAIL PROTECTED] wrote: Erik Huelsmann wrote: On 7/25/07, Brad Stiles [EMAIL PROTECTED] wrote: I ran into a situation today concerning the command line client and windows security. The problem is that the user (a build user performing automated builds on a Windows 2003 VM) attempted to check out a file to a network share and couldn't, apparently because that user doesn't have access to the share root, even though it does have access to the directory into which the file should be exported. The command I used was: svn export http://server/trunk/dir1/dir2/file.txt \\server\share\dir1\dir2\dir3\file.txt --non-interactive --force To which svn.exe responded: svn: Error resolving case of '\\server\share\dir1\dir2\dir3\file.txt' In that tree, the user has no rights to \\server\share\, but has read/write access to dir1 and below. If, as that user, I do 'dir \\server\share', I see nothing, but if I do 'dir \\server\share\dir1', I see everything just fine. Now, in our case, that shouldn't be a problem much longer because the network guys are in the process of changing the access rights on that directory; the build user really should have access there. However, I don't think it's unreasonable to assume that there might be a legitimate reason for such a restriction. Is there something we did wrong here (aside from the rights issue) that would have allowed the export to happen despite the access rights? FWIW, 'svn checkout' resulted in the same type of error. When the network guy saw it, he said, Maybe they're using the old NT4 method of stepping through the tree, rather than going straight to the location specified. Is there something in whatever layer svn is using for this that accesses the tree step by step rather than as one entity? And is it worth fixing if it is a problem? Well, it's an error that finds its roots in a call to apr_filepath_merge with an argument of APR_FILEPATH_TRUENAME. Looking at that function, it calls apr_filepath_root, which tries to isolate the root in order to see whether it actually exists. That scheme breaks down when there are no read/write permissions. This presumes it's an error at all. I didn't mean that. I meant to say it's an error *message* that ... (etc) The _TRUENAME flag tells us that we MUST use the canonical path. The *only* way to accomplish this is with read/traverse access through the file path. The arguements for this approach is that an app (such as a server) often needs to deal with the single proper name of a resource. Program Files may be Progra~1, it may also be Progra~2, or on netware, ProgramF. I have no idea whether the APR project feels this is something worth to be fixed, or even fixeable. I cc'ed their dev@ list, to make them aware if the weren't already. Of course it won't be fixed, because it is not broken. That's what I meant by fixeable. At first glance it looks rather broken (in the Win32 api) that the only way to retrieve the canonical path is to traverse the entire tree from that path up. But, alas, that's not up for discussion here. I hope Brad got his answer; I'll move on. Thanks for your time, Bill. bye, Erik.
apr_filepath_encoding on Darwin
Reading [1], I conclude that applications should pass UTF-8 to BSD functions such as stat() at all times. This suggests to me that apr_filepath_encoding() should return APR_FILEPATH_ENCODING_UTF8. Yet, looking at the sources, on any Unixy system, it returns APR_FILEPATH_ENCODING_LOCALE. Is this an oversight, or am I missing something else? bye, Erik. [1] http://developer.apple.com/documentation/MacOSX/Conceptual/BPInternational/index.html#//apple_ref/doc/uid/1171i
Re: apr_filepath_encoding on Darwin
On 7/17/07, Joe Orton [EMAIL PROTECTED] wrote: On Tue, Jul 17, 2007 at 02:14:25PM +0200, Erik Huelsmann wrote: Reading [1], I conclude that applications should pass UTF-8 to BSD functions such as stat() at all times. This suggests to me that apr_filepath_encoding() should return APR_FILEPATH_ENCODING_UTF8. Yet, looking at the sources, on any Unixy system, it returns APR_FILEPATH_ENCODING_LOCALE. Is this an oversight, or am I missing something else? This is deliberate; on Unix the character set used for filenames is dictated by the locale settings (e.g. LC_CTYPE), by convention. There is certainly no Unix standard which dictates that all filenames must be UTF-8-encoded Unicode, so APR cannot enforce that. I didn't mean to imply this should apply to any unix, but the Darwin docs do seem to say that this is the standard on Darwin. Doesn't APR adhere to what's standard on Windows in its Windows incarnation? By analogy, shouldn't it do the same for Darwin? I'm not suggesting this should apply to any other Unix than Darwin and the docstring for apr_filepath_encoding allows it to return other values than _LOCALE. bye, Erik.
Re: apr_filepath_encoding on Darwin
On 7/17/07, Joe Orton [EMAIL PROTECTED] wrote: On Tue, Jul 17, 2007 at 02:31:27PM +0200, Erik Huelsmann wrote: On 7/17/07, Joe Orton [EMAIL PROTECTED] wrote: This is deliberate; on Unix the character set used for filenames is dictated by the locale settings (e.g. LC_CTYPE), by convention. There is certainly no Unix standard which dictates that all filenames must be UTF-8-encoded Unicode, so APR cannot enforce that. I didn't mean to imply this should apply to any unix, but the Darwin docs do seem to say that this is the standard on Darwin. Presuming that Darwin still has a configurable locale like any other Unix, is the standard really anything other than a default? Well, as I quoted in my post to Lucian from the url I posted before: All BSD system functions expect their string parameters to be in UTF-8 encoding and nothing else. That doesn't seem like a default to me, more like a requirement, but if I were certain, I would have posted a patch instead of an inquiry. bye, Erik.
Re: apr_filepath_encoding on Darwin
On 7/17/07, Lucian Adrian Grijincu [EMAIL PROTECTED] wrote: On 7/17/07, Erik Huelsmann [EMAIL PROTECTED] wrote: On 7/17/07, Lucian Adrian Grijincu [EMAIL PROTECTED] wrote: APR holds everything in locale encoding on UNIX and UTF8 on Windows. As far as I know UNIX system calls must accept *locale* strings, not an arbitrary encoding (be it UTF8 or other). This may be Mac related, could you point out exactly from where you deduced that strings passed to stat() must be UTF8 ... that's a mighty large document to digest :) Sure. First sentence of the last paragraph under File Systems and Unicode Support: I think you mean this page: http://developer.apple.com/documentation/MacOSX/Conceptual/BPInternational/Articles/FileEncodings.html Sorry, yes, I do. If I'm not mistaking the text before this paragraph talks about the encoding used to represent filenames inside the file system. But this is not related to the locale (well if a locale used names not representable in the encoding of the file system you may get corrupted data or what else). So, please can you confirm that you can set a locale different than UTF-8? No, I can't confirm: I'm trying to find out why Subversion is the only app complaining there's no locale set when interacting with the filesystem while the rest of the world seems perfectly fine with reading UTF-8 filenames. [But I don't own a Mac myself either, though I sure can find someone to test for me, later today.] Subversion generates an error when there's no locale setting *and* APR returns APR_FILEPATH_ENCODING_LOCALE, but everybody kept on insisting there's no reason, because there's no locale involved in OS X file names. (Presenting to the user, maybe, but not for storing on disk.) I've never worked on a Mac, but can you set the locale to something else than UTF-8? Erik. bye, Erik.
Re: apr_filepath_encoding on Darwin
On 7/17/07, Lucian Adrian Grijincu [EMAIL PROTECTED] wrote: On 7/17/07, Erik Huelsmann [EMAIL PROTECTED] wrote: No, I can't confirm: I'm trying to find out why Subversion is the only app complaining there's no locale set when interacting with the filesystem while the rest of the world seems perfectly fine with reading UTF-8 filenames. [But I don't own a Mac myself either, though I sure can find someone to test for me, later today.] Subversion generates an error when there's no locale setting *and* APR returns APR_FILEPATH_ENCODING_LOCALE, but everybody kept on insisting there's no reason, because there's no locale involved in OS X file names. (Presenting to the user, maybe, but not for storing on disk.) Hmm, this sounds like an error of interpreting things from subversion's point of view. Why? Subversion shouldn't need to know about this Darwin convention. That's why we depend on APR. If APR tells us to check the locale, we do. If it tells us we can write UTF8, we do and since we're UTF8 internally, we don't need to do character translation (meaning we don't need the locale system to be initialized in that case). One thing you could try is to see whether compiling APR to return APR_FILEPATH_ENCODING_UTF8 fixes your problem or not. Oh, that will fix the problem, because it will not try character translation if it gets _UTF8. *IF* UTF8 is the only locale on Macs, APR_FILEPATH_ENCODING_UTF8 seems like the right value to return, because it's more explicit than APR_FILEPATH_ENCODING_LOCALE, but then again, I haven't looked to see how code interacts with these values. Well, I was arguing that if the filepath encoding is independent of the locale, returning APR_FILEPATH_ENCODING_LOCALE would be the wrong thing to return. I'll try to get in contact with someone with experience developing on OS X and I'll get back to this discussion. bye, Erik.
Re: apr_filepath_encoding on Darwin
On 7/17/07, Wilfredo Sánchez Vega [EMAIL PROTECTED] wrote: On Jul 17, 2007, at 5:25 AM, Joe Orton wrote: On Tue, Jul 17, 2007 at 02:14:25PM +0200, Erik Huelsmann wrote: Reading [1], I conclude that applications should pass UTF-8 to BSD functions such as stat() at all times. This suggests to me that apr_filepath_encoding() should return APR_FILEPATH_ENCODING_UTF8. Yet, looking at the sources, on any Unixy system, it returns APR_FILEPATH_ENCODING_LOCALE. Is this an oversight, or am I missing something else? My understanding is that in Darwin/Mac OS, all file names, when accessed above the VFS layer, are, by convention, decomposed UTF-8. This is confirmed by the Tech Note: http://developer.apple.com/qa/qa2001/qa1173.html At the top: In Mac OS X's VFS API file names are, by definition, canonically decomposed Unicode, encoded using UTF-8. Which suggests that apr_filepath_encoding should return APR_FILEPATH_ENCODING_UTF8, if I'm not mistaken. Under Returning Names, it is clear that the file system implementation is expected to convert the on-disk file name encoding (if known) to decomposed UTF-8: When returning names to higher layers (for example, from your VOP_READDIR entry point), you should always return decomposed names. If your underlying volume format uses precomposed names, you should convert any precomposed characters to their decomposed equivalents before returning them to the system. Note that the above is considerably easier for a file system like HFS+, where we know the on-disk encodings. It's trickier for any file system which doesn't specify the file name encoding, which unfortunately is most. It's particularly tricky when the volume format is shared across different operating systems, since other systems do not, AFAICT, have well-established conventions for file name encoding (*). Note also that the convention is not enforced per se (**). As a result, you aren't guaranteed, even on Mac OS, that file names are valid UTF-8 (***). That poses interesting problems. For example, CFString (and therefore basically all Mac apps) have been known to barf (and crash) when given a file name which isn't UTF-8, since it is typically told that it is UTF-8. But generally applications won't work with these non-UTF8 paths if they are well behaving MacOSX apps themselves, right? That reduces chances of being fed garbage. But, other OSes can't guarantee UTF-8ness either, because LANG (and LC_CTYPE) can be user-settings, which can differ for different users, but path names are the same for all users. So on Linux you can't be too sure either. Thanks for the extensive explanation. bye, Erik.
Re: apr-util/misc/apr_thread_pool.c compile failure in VC6
On 5/1/07, Curt Arnold [EMAIL PROTECTED] wrote: apr_thread_pool.c did not appear in the VC6 project in 1.2.8, but has been added in the SVN HEAD and it fails to compile under VC6 in debug mode due to a C++ style variable declaration. The following change fixes the compile error. Shouldn't the moved variable be in conditionals too? This will introduce a warning when *not* compiling with NDEBUG, right? bye, Erik. Index: apr_thread_pool.c === --- apr_thread_pool.c (revision 534087) +++ apr_thread_pool.c (working copy) @@ -595,6 +595,7 @@ static void wait_on_busy_threads(apr_thread_pool_t * me, void *owner) { +apr_os_thread_t *os_thread; struct apr_thread_list_elt *elt; apr_thread_mutex_lock(me-lock); elt = APR_RING_FIRST(me-busy_thds); @@ -605,7 +606,6 @@ } #ifndef NDEBUG /* make sure the thread is not the one calling tasks_cancel */ -apr_os_thread_t *os_thread; apr_os_thread_get(os_thread, elt-thd); #ifdef WIN32 /* hack for apr win32 bug */
Re: Fix writing large chunks to consoles/pipes
On 3/22/07, Jonathan Gilbert [EMAIL PROTECTED] wrote: At 10:51 AM 3/21/2007 +0100, you wrote: There's a thread about fixing some issue in Subversion w.r.t. writing large amounts of data to a console handle. This issue is an example of a broader issue: Windows doesn't support arbitrary amounts of data to be written to all of its handle types. Smaller writes *are* supported, but it doesn't automatically fall back to smaller writes when the underlaying handle only supports small writes. That is correct, as I understand it. Hi Jonathan, Thanks for the time you put into this. I was however not meaning to ask for your reaction, but rather one by the committers (as we both are not). Since in the end they decide what goes in, I was asking what I can do to help resolve the issue to their satisfaction. I'm still hoping to find a committer committed to this problem enough to help us out here - if only by providing some help on additional requirements! There were a number of patches contributed until now, with the following properties (please correct me if I'm wrong or incomplete): 1) Split all writes into 'always supported' smaller writes My original patch. Note that while it used the buffering functionality, the handle wasn't technically buffered in the sense of combining separate calls to apr_file_write. 2) Split writes for handles with specific numbers into 'always supported' smaller writes Well, it wasn't done by numbers, but that was my second patch. It reworked how specifically console handles are opened, including some factoring of common std handle code that would be nice as a stand-alone patch even if the approach is rejected. 3) Fall back to a buffered file once the condition has been detected which signals this kind of problem This was my third patch, but it isn't entirely correct to refer to the file as buffered for the same reason as (1). I'll explain below. 4) Fall back to a small write (once) when detecting this condition, returning an incomplete write This was proposed by Joe and implemented in the patch you posted to the list. (1) was rejected because the increased performance on 'other' handles must be an OS issue not to be worked around in a portability layer Honestly, I think (1) should be applied. You wouldn't normally write code in such a way because you would expect it to decrease performance. My testing showed, however, that -- at least for file handles -- the Win32 APIs actually perform *better* with a larger number of smaller writes. It is a very odd result, but it was consistent across more than 10 hours of continuous testing, both with preallocated files and with the file blocks being allocated on-the-fly. My test compared writing 55 megabyte files in 48KB blocks with writing the same files as a combination of each possible whole-megabyte write size from 1 to 10, shuffling the order of the large writes and randomly interleaving the two for each iteration, and at the end of it, every single test iteration showed the 48 KB writes to be around 3 times faster than the large writes for the same amount of total data written. (2) was rejected, because it doesn't solve the issue when other handle numbers are of the same class There are only 3 console handles, and they are all opened through the methods that I modified. The only question is whether stand-alone pipes might behave the same way as the console std handles (which would be a fair indication that console std handles are in fact pipes :-). Certainly files do not behave that way, but, as described above and in previous messages, files would actually perform better with writes cut up. (3) Wasn't rejected (yet?), but feels wrong to me: an unbuffered file was created, then it should probably stay unbuffered (since starting to buffer will change the behavior of the file significantly on smaller writes) This is where the confusion lies. If you look closely at my third patch, it introduces a new BOOLEAN field in the (opaque) apr_file_t structure called autoflush. This is combined with two extra lines of code in apr_file_write so that when this field is set, after the last chunk of the input buffer is sent to the apr_file_t's buffer (which may be the only chunk if it is a small write), apr_file_flush is called. Thus, the behaviour remains the same; apr_file_write doesn't return until the data provided has guaranteed to have been passed to the OS. The buffering code is used because it already divides the data up in a way that is known to work (though it is actually fairly inefficient, with an unnecessary complete copy of the data from the user-supplied buffer into the apr_file_t's buffer), but the file handle isn't actually made to behave like a buffered file. I don't understand the resistance to (1), but if (1) won't be applied, I think (3) is the best option. (4) Was rejected because the same condition was detected over and over again, resulting in too many failed OS calls. Patch #4 can be easily extended to cache
Fix writing large chunks to consoles/pipes
There's a thread about fixing some issue in Subversion w.r.t. writing large amounts of data to a console handle. This issue is an example of a broader issue: Windows doesn't support arbitrary amounts of data to be written to all of its handle types. Smaller writes *are* supported, but it doesn't automatically fall back to smaller writes when the underlaying handle only supports small writes. I'd say the apr_file_write() API allows for partial writes, meaning the internals on Windows can compensate for the behaviour in Windows (by falling back to smaller writes when detecting certain error return codes). There were a number of patches contributed until now, with the following properties (please correct me if I'm wrong or incomplete): 1) Split all writes into 'always supported' smaller writes 2) Split writes for handles with specific numbers into 'always supported' smaller writes 3) Fall back to a buffered file once the condition has been detected which signals this kind of problem 4) Fall back to a small write (once) when detecting this condition, returning an incomplete write (1) was rejected because the increased performance on 'other' handles must be an OS issue not to be worked around in a portability layer (2) was rejected, because it doesn't solve the issue when other handle numbers are of the same class (3) Wasn't rejected (yet?), but feels wrong to me: an unbuffered file was created, then it should probably stay unbuffered (since starting to buffer will change the behavior of the file significantly on smaller writes) (4) Was rejected because the same condition was detected over and over again, resulting in too many failed OS calls. Patch #4 can be easily extended to cache the result of the fallback in the file structure. Would that be enough to get this change committed / issue resolved? If not, what more would you expect? bye, Erik.
Re: [PATCH] Fix for Subversion issue #1789
My gut says the extensive additional code to flip to buffered mode merits it's very own function (internal to apr would be fine, for now, but the fact that we need to trigger it hints that others might want to too, in some future 1.3.0 release.) We can't return 0 bytes written, too many apps will presume 0 means file closed, other signals tripped, error etc. We should do at least one short-write before returning in either case. Thoughts? I can live with the proposed solution, though I think the case of writing to consoles (and pipes?) is such an edge-case that it doesn't really warrent the enormous code-complexity proposed. Especially consoles are low-throughput devices anyway (making 2n+1 WriteFile calls no real problem). Again, I can live with the proposed solution. bye, Erik.
Re: [PATCH] Fix for Subversion issue #1789
On 2/23/07, William A. Rowe, Jr. [EMAIL PROTECTED] wrote: Jonathan Gilbert wrote: The updated version of the patch only enforces chunked writes specifically for console output, and it does it using the existing buffering functionality, so no new code is required, unlike the original patch. But unlike your original patch, you begin buffering output that the author didn't intend to buffer? You assume stdhandles are consoles. Other objects can be similarly pipe-oriented streams that will blow up the same way, we might as well fix all these cases now if we are investing the time in this patch. Yet stdhandles aren't necessarily consoles and your patch tromps those cases of redirected standard handles with this assumption. It appears your proposed patch is moving in the wrong direction from where you started, IMHO. Can we return to the earliest comments and react to the case from the moment where we first flood NT's pipe buffer, and we must back down to 32k writes? Jonathan, Any progress on the matter? I'd love to be able to close the issue in Subversion by pointing to the right APR version! bye, Erik.
Re: [PATCH] Fix for Subversion issue #1789
On 3/14/07, Jonathan Gilbert [EMAIL PROTECTED] wrote: At 02:22 PM 3/14/2007 +0100, Eric Huelsmann wrote: Jonathan, Any progress on the matter? I'd love to be able to close the issue in Subversion by pointing to the right APR version! My feeling at the moment is that the patch is ready to be applied, but Bill Rowe does not agree. I did some performance tests to see just what the effect of buffering would be, and when run on UNIX systems, I saw the expected ~4% degradation when splitting up large writes into 30 KB blocks. When run on Windows, however, I saw a massive performance *increase*, a factor of more than 3. So, a patch which unconditionally breaks writes up into 30 KB blocks would apparently be beneficial for Windows in all cases. There are currently two patches out there that I've submitted; one of them modifies the Win32 apr_file_write directly so that all writes are split, and the other one uses APR's built-in buffering on all console-related handles (and console handles only). Either one would fix the problem of large writes to consoles, but some people seem to think it would be better to always try the large write and then switch to buffered output if the write attempt fails with the Insufficient buffer space error. Yes, that's the general idea I got from Joe. I think in principle this makes sense, but in practice it doesn't since there is actually a performance penalty on the platform for doing large writes in single calls to WriteFile. Possibly, but it's also an edge-case to use large to even extremely large writes on console handles, so I do really see some benefit in not making this change more complex than necessary. To help this change get back on track, I created my own patch (but note: I don't have a Windows dev environment!). This patch does nothing but fall back to a smaller write when there is an insufficient buffer error *and* the write was larger than the chunk you've determined to be writeable (ie 30kB). This is it: Index: file_io/win32/readwrite.c === --- file_io/win32/readwrite.c(revision 518377) +++ file_io/win32/readwrite.c(working copy) @@ -312,6 +312,10 @@ } rv = WriteFile(thefile-filehand, buf, (DWORD)*nbytes, bwrote, thefile-pOverlapped); +if (rv == ERROR_INSUFFICIENT_BUFFER *nbytes (32*1024)) + rv = WriteFile(thefile-filehand, buf, (DWORD)(32*1024), bwrote,+ thefile-pOverlapped); + if (thefile-append) { apr_file_unlock(thefile); apr_thread_mutex_unlock(thefile-mutex); @@ -320,6 +324,9 @@ else { rv = WriteFile(thefile-filehand, buf, (DWORD)*nbytes, bwrote, thefile-pOverlapped); +if (rv == ERROR_INSUFFICIENT_BUFFER *nbytes (32*1024)) + rv = WriteFile(thefile-filehand, buf, (DWORD)(32*1024), bwrote,+ thefile-pOverlapped); } if (rv) { *nbytes = bwrote; @@ -485,7 +492,13 @@ numbytes = (DWORD)bytesleft; } -if (!WriteFile(thefile-filehand, buffer, numbytes, written, NULL)) { +rv = WriteFile(thefile-filehand, buffer, numbytes, + written, NULL); +if (rv == ERROR_INSUFFICIENT_BUFFER numbytes (32*1024)) + rv = WriteFile(thefile-filehand, buffer, (DWORD)(32*1024), + bwrote, NULL); + +if (!rv) rc = apr_get_os_error(); thefile-filePtr += written; break; Bill, Jonathan, could the two of you agree on this patch? Joe? If necessary, I can provide a patch for backporting to 0.9.x too. Hope that helps. bye, Erik.
Re: [PATCH] Fix for Subversion issue #1789
(but note: I don't have a Windows dev environment!). Not having an environment to test shows: as it turns out, WriteFile returns a BOOL, not a DWORD, meaning that the patch below is much more applicable. I hope it's good enough now. (BTW, the patch requires the '-p2' parameter for patch(1) to apply correctly.) Index: ../apr/file_io/win32/readwrite.c === --- ../apr/file_io/win32/readwrite.c(revision 518377) +++ ../apr/file_io/win32/readwrite.c(working copy) @@ -310,15 +310,25 @@ thefile-pOverlapped-Offset = (DWORD)thefile-filePtr; thefile-pOverlapped-OffsetHigh = (DWORD)(thefile-filePtr 32); } -rv = WriteFile(thefile-filehand, buf, (DWORD)*nbytes, bwrote, - thefile-pOverlapped); + +if (!(rv = WriteFile(thefile-filehand, buf, (DWORD)*nbytes, + bwrote, thefile-pOverlapped)) + GetLastError() == ERROR_INSUFFICIENT_BUFFER + *nbytes (30*1024)) + rv = WriteFile(thefile-filehand, buf, (DWORD)(30*1024), bwrote,+ thefile-pOverlapped); + if (thefile-append) { apr_file_unlock(thefile); apr_thread_mutex_unlock(thefile-mutex); } } else { -rv = WriteFile(thefile-filehand, buf, (DWORD)*nbytes, bwrote, + if (!(rv = WriteFile(thefile-filehand, buf, (DWORD)*nbytes, bwrote,+ thefile-pOverlapped)) + GetLastError() == ERROR_INSUFFICIENT_BUFFER + *nbytes (30*1024)) +rv = WriteFile(thefile-filehand, buf, (DWORD)(30*1024), bwrote, thefile-pOverlapped); } if (rv) { @@ -478,6 +488,8 @@ bytesleft = thefile-bufpos; do { +DWORD rv; + if (bytesleft APR_DWORD_MAX) { numbytes = APR_DWORD_MAX; } @@ -485,7 +497,14 @@ numbytes = (DWORD)bytesleft; } -if (!WriteFile(thefile-filehand, buffer, numbytes, written, NULL)) { +if (!(rv = WriteFile(thefile-filehand, buffer, numbytes, + written, NULL)) + GetLastError() == ERROR_INSUFFICIENT_BUFFER + numbytes (30*1024)) + rv = WriteFile(thefile-filehand, buffer, (DWORD)(30*1024), + bwrote, NULL); + +if (!rv) rc = apr_get_os_error(); thefile-filePtr += written; break; bye, Erik.
Fwd: Win32 build files and eol-style
Sorry, ended up sending my reply to Mladen personally instead of the lists. bye, Erik. -- Forwarded message -- From: Erik Huelsmann [EMAIL PROTECTED] Date: Nov 14, 2005 1:10 PM Subject: Re: Win32 build files and eol-style To: Mladen Turk [EMAIL PROTECTED] On 11/14/05, Mladen Turk [EMAIL PROTECTED] wrote: Colm MacCarthaigh wrote: I'd like to turn the svn:eol-style attribute off for the windown build files (files ending in .dsp, .dsw and win32ver.awk), and have them stored in win32 new-line format in the repository. Any objections? I'm not sure you can use the .dsw and .dsp file outside windows, and on windows the files are already in CRLF when checked out on that platform. IMO any .zip source distro should be made with CRLF line endings for all non-binary files, so again no problem there. There is even a perl script (lineends.pl) for those that wish to create CRLF distros on non-windows platform. I see no real reason why they should be forced to have the CRLF line endings in all cases. So, yes, I think it's useless to enforce something like that. A reason to set the eol-style to CRLF is that *if* someone edits them on unix and accidentally inserts LFs, they're forcibly recoded to CRLF upon commit. Which -obviously- doesn't happen if you don't set an eol-style. bye, Erik.
Re: [PATCH] Fix loop termination in apr_file_write
On 8/20/05, Joe Orton [EMAIL PROTECTED] wrote: I promissed Karl Fogel a reproduction recipe, but didn't succeed in creating one. OTOH, I think it's clear the line needs changing... I've committed this with a test case (trying to write to a file which was opened read-only), thanks Erik. Welcome. Given that this is not only a problem for Subversion clients, but also for mod_dav_svn and svnserve servers [which are short on /tmp disc space for example], I'd like to ask for a backport to 0.9.x and a 0.9.7 release (somewhere in the not-too-far-off future). bye, Erik.
[PATCH] Fix loop termination in apr_file_write
Sometimes Subversion ends in an unterminated loop when checking out to a full device. Looking at the Windows and OS/2 code, the unix code seems to be missing a line which changes the loop termination variable rv. I promissed Karl Fogel a reproduction recipe, but didn't succeed in creating one. OTOH, I think it's clear the line needs changing... bye, Erik. PS: it's a patch against trunk. I can supply one for all branches, because I would very much like it to be backported to 0.9.x, but the change is so trivial I think using merge is faster Index: file_io/unix/readwrite.c === --- file_io/unix/readwrite.c(revision 233542) +++ file_io/unix/readwrite.c(working copy) @@ -170,7 +170,7 @@ rv = 0; while (rv == 0 size 0) { if (thefile-bufpos == APR_FILE_BUFSIZE) /* write buffer is full*/ -apr_file_flush(thefile); +rv = apr_file_flush(thefile); blocksize = size APR_FILE_BUFSIZE - thefile-bufpos ? APR_FILE_BUFSIZE - thefile-bufpos : size;
Re: confusion about largefile support
On 5/31/05, Ben Collins-Sussman [EMAIL PROTECTED] wrote: On May 31, 2005, at 11:49 AM, Ben Collins-Sussman wrote: Funny, KDE is using fsfs, and I would have expected them to run into a 2GB revision file. Well, whattya know. Now Timothee Besset (ttimo) in IRC has just reported the same File size limit exceeded error that we saw on users@ earlier today. In both cases, the users were loading a dumpfile into an fsfs repository. And ttimo verified my fear. There's a 2GB file being assembled in db/txns/. So, um, maybe we should write a FAQ? One which tells folks that the only workaround here is to recompile subversion against apr 1.x? (And to upgrade to httpd 2.1 if necessary.) or use a BDB repos. bye, Erik.
Re: Move apr_xlate from apr-util to apr-iconv?
On 5/30/05, Mladen Turk [EMAIL PROTECTED] wrote: Jeff Trawick wrote: Can we move the entire apr_xlate API from apr-utils to the apr-iconv? no; apr_xlate works just fine without apr-iconv (when system provides a suitable iconv) It does not. Did you read what he said? when the system provides a suitable iconv: Win32 does not provide such a suitable iconv. On my Linux system (which does provide a suitable iconv), I don't need to compile apr-iconv to use apr_xlate. I guess it does, then. bye, Erik.
Win32: Ruby APR; build problems for Ruby Subversion SWIG bindings
Having taken upon me the task to provide a Windows build for Subversion, I run into a problem with the Ruby SWIG bindings. The problem is with APR and Ruby Win32 header files though. For those who don't know any of the terms used: Subversion: the next generation of centralised version control (http://www.tigris.org/; down until probably somewhere tomorrow) SWIG: a plugin generator for programming languages (http://www.swig.org) Ruby: a programming language (http://www.ruby-lang.org) APR: the Apache Portable Runtime (http://apr.apache.org) upon which Subversion has been built Ok. So now for the real problem: Windows does not define pid_t, gid_t or uid_t in any of the system headers. Both APR and Ruby *do* define these 3 types: in apr.hw (renamed to apr.h during windows build): typedef int pid_t typedef int gid_t typedef int uid_t in win32.h from the ruby-mswin32-1.8.2 zip: #define pid_t int #define gid_t int #define uid_t int Normally I would resolve this problem by including apr.h first, then followed by ruby.h; however, I use SWIG to generate the wrappers and SWIG includes ruby.h first. Now the compiler is generating errors that there are lines like this in the sources: typedef int int which is ofcourse absolute non-sense. Because I want to build to distribute, I don't want to modify any of the headers involved. I do need some help getting out of this mess though! bye, Erik.