lostluck commented on a change in pull request #15896: URL: https://github.com/apache/beam/pull/15896#discussion_r744939632
########## File path: sdks/go/pkg/beam/util/hooksx/cache.go ########## @@ -0,0 +1,40 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package hooksx defines user-facing entrypoints into Beam hooks, providing +// input validation and removing potential errors arising from typos in hook +// names. This also allows hook failures to occur earlier when the hook is +// enabled, not when the hooks are called. All hooks that accept input from +// users should have hooksx functions associated with them. +package hooksx + +import ( + "fmt" + "strconv" + + "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks" +) + +// EnableSideInputCache accepts a desired maximum size for the side input cache, validates that +// it is non-negative, then formats the integer as a string to pass to EnableHook. A size of 0 +// disables the cache. This hook is utilized in pkg/beam/core/runtime/harness/harness.go when the +// cache is initialized. +func EnableSideInputCache(size int64) error { Review comment: Lets go with "capacity" for the parameter name, as it does show up in documentation, and is clearer that it's not going to be a byte amount or similar vs `size`. Go typically eshews extra words like enable, get, set, if it can get away with them, so we can just call this `SideInputCacheCapacity` as a method instead. The documentation can clarify that a non-zero value enables the cache (as it already does). ########## File path: sdks/go/pkg/beam/util/hooksx/cache.go ########## @@ -0,0 +1,40 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package hooksx defines user-facing entrypoints into Beam hooks, providing +// input validation and removing potential errors arising from typos in hook +// names. This also allows hook failures to occur earlier when the hook is +// enabled, not when the hooks are called. All hooks that accept input from +// users should have hooksx functions associated with them. +package hooksx Review comment: The more I look at "hooksx" the less I like it. 1. The triple consonant "ksx" is not great to pronounce. 2. hooksx says what the package contains, but not necessarily what they're for. Perhaps we should change it to `harnesshooks` or `harnessx` (though, there's that `sx` again). It would be clearer that these are hooks intended to modify the harness. `harnessopts` since they're options for the harness? What do you think? Bringing in the harness association will also make the package description clearer and more useful for users. ########## File path: sdks/go/pkg/beam/core/runtime/harness/cache_hooks.go ########## @@ -0,0 +1,51 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package harness + +import ( + "context" + "fmt" + + "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks" +) + +var ( + cacheSize int = 0 +) + +func init() { + hf := func(opts []string) hooks.Hook { + return hooks.Hook{ + Init: func(ctx context.Context) (context.Context, error) { + if len(opts) == 0 { + return ctx, nil + } + if len(opts) > 1 { + return ctx, fmt.Errorf("expected 1 option, got %v: %v", len(opts), opts) + } + + var count int + _, err := fmt.Sscan(opts[0], &count) + if err != nil { + return nil, err + } + cacheSize = count + return ctx, nil + }, + } + } + hooks.RegisterHook("EnableSideInputCache", hf) Review comment: Lets go with something more Urn-like. Ideally we don't want to crowd the global "namespace" too much, and SDK based hooks should be . "beam:go:hook:sideinputcache:capacity" Says it's a beam thing, for the Go SDK, for a hook, and specifically configuring the cache with an capacity count. Capacity usually binds to an integer value, while size could also refer to memory, so it's good to be specific. This gives us the option to replace it with a different hook if we can figure out the memory size limit thing later. While also letting us change the encoding format for the configuration options later too (by introducing a new URN). Just like other URNs, the urn defines the data format, in this case, just a string representation of the capacity. ########## File path: sdks/go/pkg/beam/util/hooksx/cache.go ########## @@ -0,0 +1,40 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package hooksx defines user-facing entrypoints into Beam hooks, providing +// input validation and removing potential errors arising from typos in hook +// names. This also allows hook failures to occur earlier when the hook is +// enabled, not when the hooks are called. All hooks that accept input from +// users should have hooksx functions associated with them. +package hooksx + +import ( + "fmt" + "strconv" + + "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks" +) + +// EnableSideInputCache accepts a desired maximum size for the side input cache, validates that +// it is non-negative, then formats the integer as a string to pass to EnableHook. A size of 0 +// disables the cache. This hook is utilized in pkg/beam/core/runtime/harness/harness.go when the Review comment: We don't need to write out the the implementation details in the description (eg the bit about formatting and calling EnableHook). That's the point of abstractions and encapsulation so users don't see the details. A user who cares can always look at the source for how it's written. We should have a comment in the code about where the actual hook code lives, but it's not going to be in the Go Doc, it should be above the `hooks.EnableHook` call. ########## File path: sdks/go/pkg/beam/util/hooksx/cache.go ########## @@ -0,0 +1,40 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package hooksx defines user-facing entrypoints into Beam hooks, providing +// input validation and removing potential errors arising from typos in hook +// names. This also allows hook failures to occur earlier when the hook is +// enabled, not when the hooks are called. All hooks that accept input from +// users should have hooksx functions associated with them. +package hooksx + +import ( + "fmt" + "strconv" + + "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks" +) + +// EnableSideInputCache accepts a desired maximum size for the side input cache, validates that +// it is non-negative, then formats the integer as a string to pass to EnableHook. A size of 0 +// disables the cache. This hook is utilized in pkg/beam/core/runtime/harness/harness.go when the +// cache is initialized. Review comment: We should call out a few more things in this description: 1. The cache is disabled by default. 2. That the cache requires runner support. ########## File path: sdks/go/pkg/beam/util/hooksx/cache_test.go ########## @@ -0,0 +1,57 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package hooksx + +import ( + "testing" + + "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks" +) + +// The true EnableSideInputCache hook is defined in harness.go in an init() +// function, so we mock it here to make sure the correct arg is being passed +// to the hooks package. +func createMockCacheHook() { + hooks.RegisterHook("EnableSideInputCache", func(opts []string) hooks.Hook { + return hooks.Hook{} + }) +} Review comment: Technically since it's in a test file, we should in principle be able to _ import the harness package into the tests here, which will run the `init()` blocks and avoid the mock entirely. This will let us actually check that the same hook name is being used. ########## File path: sdks/go/pkg/beam/util/hooksx/cache_test.go ########## @@ -0,0 +1,57 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package hooksx + +import ( + "testing" + + "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks" +) + +// The true EnableSideInputCache hook is defined in harness.go in an init() +// function, so we mock it here to make sure the correct arg is being passed +// to the hooks package. +func createMockCacheHook() { + hooks.RegisterHook("EnableSideInputCache", func(opts []string) hooks.Hook { + return hooks.Hook{} + }) +} + +func TestEnableSideInputCache(t *testing.T) { + createMockCacheHook() + err := EnableSideInputCache(1) + if err != nil { + t.Errorf("EnableSideInputCache failed when it should have succeeded, got %v", err) + } + ok, opts := hooks.IsEnabled("EnableSideInputCache") Review comment: We probably want to have the actual hook name string be a constant in this package, to avoid typo problems. Ideally it would be defined once and we use the constant for both sides, but we don't want to Export that from the packages. It will be sufficient that this test is validating they're the same. ########## File path: sdks/go/pkg/beam/util/hooksx/cache_test.go ########## @@ -0,0 +1,57 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package hooksx + +import ( + "testing" + + "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks" +) + +// The true EnableSideInputCache hook is defined in harness.go in an init() Review comment: ```suggestion // The true EnableSideInputCache hook is defined in harness/cache_hooks.go. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
