MOTIVATION:
Meet Bob. Bob is a developer with mostly C++ and Java experience, but who has
been learning Swift. Bob needs to write an app to parse some proprietary binary
data format that his company requires. Bob’s written this app, and it’s worked
pretty well on Linux:

import Foundation

do {
    let url = ...
    let handle = try FileHandle(forReadingFrom: url)
    let bufsize = 1024 * 1024 // read 1 MiB at a time

    while true {
        let data = handle.readData(ofLength: bufsize)

        if data.isEmpty {
            break
        }

        data.withUnsafeBytes { (bytes: UnsafePointer<UInt8>) in
            // do something with bytes
        }
    }
} catch {
    print("Error occurred: \(error.localizedDescription)")
}

Later, Bob needs to port this same app to macOS. All seems to work well, until
Bob tries opening a large file of many gigabytes in size. Suddenly, the simple
act of running the app causes Bob’s Mac to completely lock up, beachball, and
finally pop up with the dreaded “This computer is out of system memory”
message. If Bob’s particularly unlucky, things will lock up tight enough that
he can’t even recover from there, and may have to hard-reboot the machine.
What happened?
Experienced Objective-C developers will spot the problem right away: the
Foundation APIs that Bob used generated autoreleased objects, which would not
be released until Bob’s loop finished. However, Bob’s never programmed in
Objective-C, and to him, this behavior is completely undecipherable.
After a copious amount of time spent Googling for answers and asking for help
on various mailing lists and message boards, Bob finally gets the
recommendation from someone to try wrapping the file handle read in an
autorelease pool. So he does:

import Foundation

do {
    let url = ...
    let handle = try FileHandle(forReadingFrom: url)
    let bufsize = 1024 * 1024 // read 1 MiB at a time

    while true {
        let data = autoreleasepool { handle.readData(ofLength: bufsize) }

        if data.isEmpty {
            break
        }

        data.withUnsafeBytes { (bytes: UnsafePointer<UInt8>) in
            // do something with bytes
        }
    }
} catch {
    print("Error occurred: \(error.localizedDescription)")
}

Unfortunately, Bob’s program still eats RAM like Homer Simpson in an
all-you-can-eat buffet. Turns out the data.withUnsafeBytes call *also* causes
the data to be autoreleased. What Bob really needs to do is to wrap the whole
thing in an autorelease pool, creating a Pyramid of Doom:

import Foundation

do {
    let url = ...
    let handle = try FileHandle(forReadingFrom: url)
    let bufsize = 1024 * 1024 // read 1 MiB at a time

    while true {
        autoreleasepool {
            let data = handle.readData(ofLength: bufsize)

            if data.isEmpty {
                break // error: ‘break’ is allowed only inside a loop, if, do, or switch
            }

            data.withUnsafeBytes { (bytes: UnsafePointer<UInt8>) in
                // do something with bytes
            }
        }
    }
} catch {
    print("Error occurred: \(error.localizedDescription)")
}

However, when Bob tries to run this, he now gets a compile error on the ‘break’
statement; it’s no longer possible to break out of the loop, since everything
inside the autorelease block is in a closure.
Bob is now regretting his decision not to become an insurance adjuster instead.
Bob’s problem, of course, can be solved by using *two* autorelease pools, one
when getting the data, and the next when working with it. But this situation is
confusing to newcomers to the language, since autorelease pools are not really
part of Swift’s idiom, and aren’t mentioned anywhere in the usual Swift
documentation. Thus, without Objective-C experience, autorelease-related issues
are completely mysterious and baffling, particularly since Data is a struct, so
it isn’t obvious that Objective-C will be involved at all when using it. Even
for experienced Objective-C developers, autorelease pools in Swift
can become awkward since, unlike with Objective-C, they can’t simply be tacked
onto a loop without losing flow control via break and continue.
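
For reference, the two-pool workaround mentioned above might look something like the following sketch. The function name processChunks(of:chunkSize:body:) and its parameters are made up for illustration, and the fallback autoreleasepool shim is an assumption for platforms without the Objective-C runtime, where no pool should be needed in the first place. Whether two pools are sufficient in every case depends on where Foundation actually autoreleases, so treat this as a starting point rather than a guaranteed fix.

```swift
import Foundation

#if !canImport(ObjectiveC)
// Assumption: without the Objective-C runtime there is nothing to drain,
// so this stand-in simply runs the closure and returns its result.
func autoreleasepool<Result>(invoking body: () throws -> Result) rethrows -> Result {
    return try body()
}
#endif

// Hypothetical helper: reads a file chunk by chunk, using one pool for the
// read and a second pool for the work done on each chunk.
func processChunks(of url: URL,
                   chunkSize: Int = 1024 * 1024,
                   body: (Data) -> Void) throws {
    let handle = try FileHandle(forReadingFrom: url)
    defer { handle.closeFile() }

    while true {
        // Pool 1: covers any temporaries created by the read itself.
        let data = autoreleasepool { handle.readData(ofLength: chunkSize) }

        // This check sits outside both closures, so break still works.
        if data.isEmpty {
            break
        }

        // Pool 2: covers any temporaries created while using the data.
        autoreleasepool { body(data) }
    }
}
```

Because the isEmpty check stays in the loop body rather than inside a closure, break and continue remain usable. Another option would be a single pool whose closure returns a Bool telling the loop whether to keep going, at the cost of some readability.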
PROPOSED SOLUTION:
In the Foundation overlay, wrap calls to Objective-C NSFileHandle and NSData
APIs that generate autoreleased objects in an autorelease pool, so that they
behave the way a user new to the language would expect, and in a manner
consistent with how they likely behave on other platforms which lack the
Objective-C bridge.
This would likely add a small performance overhead, but this should be
negligible compared to the overhead involved in reading from the disk which
will occur when using a FileHandle. In addition, if Data objects are being
accessed frequently enough for performance to be an issue, it’s likely that
enough of them will be generated to make memory overhead an issue if an
autorelease pool is not used.
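
As a rough illustration of the kind of wrapping being proposed, here is a minimal sketch written as an ordinary extension. The name pooledReadData(ofLength:) is hypothetical and is not an existing Foundation API; an actual overlay change would presumably live inside Foundation and keep the existing method names. The fallback autoreleasepool shim is again an assumption for platforms without the Objective-C runtime.

```swift
import Foundation

#if !canImport(ObjectiveC)
// Assumption: no Objective-C runtime means no pool to drain, so this
// stand-in just runs the closure.
func autoreleasepool<Result>(invoking body: () throws -> Result) rethrows -> Result {
    return try body()
}
#endif

extension FileHandle {
    // Hypothetical wrapper: performs the read inside its own autorelease
    // pool so that temporaries from the underlying call are drained before
    // the result is handed back to the caller.
    func pooledReadData(ofLength length: Int) -> Data {
        return autoreleasepool { readData(ofLength: length) }
    }
}
```

A caller would use this exactly like readData(ofLength:), with no autorelease pool of their own and with break and continue still available in the surrounding loop.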
IMPACT ON EXISTING CODE:
Code that currently works around these issues with an autorelease pool may end
up double-wrapping until these manual workarounds are removed.
Charles