lostluck commented on a change in pull request #15896:
URL: https://github.com/apache/beam/pull/15896#discussion_r744939632



##########
File path: sdks/go/pkg/beam/util/hooksx/cache.go
##########
@@ -0,0 +1,40 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Package hooksx defines user-facing entrypoints into Beam hooks, providing
+// input validation and removing potential errors arising from typos in hook
+// names. This also allows hook failures to occur earlier when the hook is
+// enabled, not when the hooks are called. All hooks that accept input from
+// users should have hooksx functions associated with them.
+package hooksx
+
+import (
+       "fmt"
+       "strconv"
+
+       "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks"
+)
+
+// EnableSideInputCache accepts a desired maximum size for the side input 
cache, validates that
+// it is non-negative, then formats the integer as a string to pass to 
EnableHook. A size of 0
+// disables the cache. This hook is utilized in 
pkg/beam/core/runtime/harness/harness.go when the
+// cache is initialized.
+func EnableSideInputCache(size int64) error {

Review comment:
       Lets go with "capacity" for the parameter name, as it does show up in 
documentation, and is clearer that it's not going to be a byte amount or 
similar vs `size`.
   
   Go typically eshews extra words like enable, get, set, if it can get away 
with them, so we can just call this 
   `SideInputCacheCapacity` as a method instead.  The documentation can clarify 
that a non-zero value enables the cache (as it already does).

##########
File path: sdks/go/pkg/beam/util/hooksx/cache.go
##########
@@ -0,0 +1,40 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Package hooksx defines user-facing entrypoints into Beam hooks, providing
+// input validation and removing potential errors arising from typos in hook
+// names. This also allows hook failures to occur earlier when the hook is
+// enabled, not when the hooks are called. All hooks that accept input from
+// users should have hooksx functions associated with them.
+package hooksx

Review comment:
       The more I look at "hooksx" the less I like it. 
   
   1. The triple consonant "ksx" is not great to pronounce.
   2. hooksx says what the package contains, but not necessarily what they're 
for.
   
   Perhaps we should change it to `harnesshooks` or `harnessx`  (though, 
there's that `sx` again). It would be clearer that these are hooks intended to 
modify the harness. `harnessopts` since they're options for the harness? What 
do you think?
   
   Bringing in the harness association will also make the package description 
clearer and more useful for users.

##########
File path: sdks/go/pkg/beam/core/runtime/harness/cache_hooks.go
##########
@@ -0,0 +1,51 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package harness
+
+import (
+       "context"
+       "fmt"
+
+       "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks"
+)
+
+var (
+       cacheSize int = 0
+)
+
+func init() {
+       hf := func(opts []string) hooks.Hook {
+               return hooks.Hook{
+                       Init: func(ctx context.Context) (context.Context, 
error) {
+                               if len(opts) == 0 {
+                                       return ctx, nil
+                               }
+                               if len(opts) > 1 {
+                                       return ctx, fmt.Errorf("expected 1 
option, got %v: %v", len(opts), opts)
+                               }
+
+                               var count int
+                               _, err := fmt.Sscan(opts[0], &count)
+                               if err != nil {
+                                       return nil, err
+                               }
+                               cacheSize = count
+                               return ctx, nil
+                       },
+               }
+       }
+       hooks.RegisterHook("EnableSideInputCache", hf)

Review comment:
       Lets go with something more Urn-like. Ideally we don't want to crowd the 
global "namespace" too much, and SDK based hooks should be .
   
   "beam:go:hook:sideinputcache:capacity"
   
   Says it's a beam thing, for the Go SDK, for a hook, and specifically 
configuring the cache with an capacity count.  Capacity usually binds to an 
integer value, while size could also refer to memory, so it's good to be 
specific.
   
   This gives us the option to replace it with a different hook if we can 
figure out the memory size limit thing later.  While also letting us change the 
encoding format for the configuration options later too (by introducing a new 
URN). 
   
   Just like other URNs, the urn defines the data format, in this case, just a 
string representation of the capacity.

##########
File path: sdks/go/pkg/beam/util/hooksx/cache.go
##########
@@ -0,0 +1,40 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Package hooksx defines user-facing entrypoints into Beam hooks, providing
+// input validation and removing potential errors arising from typos in hook
+// names. This also allows hook failures to occur earlier when the hook is
+// enabled, not when the hooks are called. All hooks that accept input from
+// users should have hooksx functions associated with them.
+package hooksx
+
+import (
+       "fmt"
+       "strconv"
+
+       "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks"
+)
+
+// EnableSideInputCache accepts a desired maximum size for the side input 
cache, validates that
+// it is non-negative, then formats the integer as a string to pass to 
EnableHook. A size of 0
+// disables the cache. This hook is utilized in 
pkg/beam/core/runtime/harness/harness.go when the

Review comment:
       We don't need to write out the the implementation details in the 
description (eg the bit about formatting and calling EnableHook). That's the 
point of abstractions and encapsulation so users don't see the details.  A user 
who cares can always look at the source for how it's written.
   
   We should have a comment in the code about where the actual hook code lives, 
but it's not going to be in the Go Doc, it should be above the 
`hooks.EnableHook` call.

##########
File path: sdks/go/pkg/beam/util/hooksx/cache.go
##########
@@ -0,0 +1,40 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Package hooksx defines user-facing entrypoints into Beam hooks, providing
+// input validation and removing potential errors arising from typos in hook
+// names. This also allows hook failures to occur earlier when the hook is
+// enabled, not when the hooks are called. All hooks that accept input from
+// users should have hooksx functions associated with them.
+package hooksx
+
+import (
+       "fmt"
+       "strconv"
+
+       "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks"
+)
+
+// EnableSideInputCache accepts a desired maximum size for the side input 
cache, validates that
+// it is non-negative, then formats the integer as a string to pass to 
EnableHook. A size of 0
+// disables the cache. This hook is utilized in 
pkg/beam/core/runtime/harness/harness.go when the
+// cache is initialized.

Review comment:
       We should call out a few more things in this description:
   1. The cache is disabled by default.
   2. That the cache requires runner support.

##########
File path: sdks/go/pkg/beam/util/hooksx/cache_test.go
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package hooksx
+
+import (
+       "testing"
+
+       "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks"
+)
+
+// The true EnableSideInputCache hook is defined in harness.go in an init()
+// function, so we mock it here to make sure the correct arg is being passed
+// to the hooks package.
+func createMockCacheHook() {
+       hooks.RegisterHook("EnableSideInputCache", func(opts []string) 
hooks.Hook {
+               return hooks.Hook{}
+       })
+}

Review comment:
       Technically since it's in a test file, we should in principle be able to 
_ import the harness package into the tests here, which will run the `init()` 
blocks and avoid the mock entirely. This will let us actually check that the 
same hook name is being used.

##########
File path: sdks/go/pkg/beam/util/hooksx/cache_test.go
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package hooksx
+
+import (
+       "testing"
+
+       "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks"
+)
+
+// The true EnableSideInputCache hook is defined in harness.go in an init()
+// function, so we mock it here to make sure the correct arg is being passed
+// to the hooks package.
+func createMockCacheHook() {
+       hooks.RegisterHook("EnableSideInputCache", func(opts []string) 
hooks.Hook {
+               return hooks.Hook{}
+       })
+}
+
+func TestEnableSideInputCache(t *testing.T) {
+       createMockCacheHook()
+       err := EnableSideInputCache(1)
+       if err != nil {
+               t.Errorf("EnableSideInputCache failed when it should have 
succeeded, got %v", err)
+       }
+       ok, opts := hooks.IsEnabled("EnableSideInputCache")

Review comment:
       We probably want to have the actual hook name string be a constant in 
this package, to avoid typo problems. Ideally it would be defined once and we 
use the constant for both sides, but we don't want to Export that from the 
packages. It will be sufficient that this test is validating they're the same.

##########
File path: sdks/go/pkg/beam/util/hooksx/cache_test.go
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package hooksx
+
+import (
+       "testing"
+
+       "github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/hooks"
+)
+
+// The true EnableSideInputCache hook is defined in harness.go in an init()

Review comment:
       ```suggestion
   // The true EnableSideInputCache hook is defined in harness/cache_hooks.go.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to